STABLE LIMITS OF MARTINGALE TRANSFORMS WITH
APPLICATION TO THE ESTIMATION OF GARCH PARAMETERS

By Thomas Mikosch and Daniel Straumann
In this paper we study the asymptotic behavior of the Gaussian quasi
maximum likelihood estimator of a stationary GARCH process with
heavy-tailed innovations. This means that the innovations are regularly
varying with index $\alpha \in (2,4)$. Then, in particular, the marginal
distribution of the GARCH process has infinite fourth moment and standard
asymptotic theory with normal limits and $\sqrt{n}$-rates breaks down. This
was recently observed by Hall and Yao [Econometrica 71 (2003) 285-317]. It
is the aim of this paper to indicate that the limit theory for the
parameter estimators in the heavy-tailed case nevertheless very much
parallels the normal asymptotic theory. In the light-tailed case, the limit
theory is based on the CLT for stationary ergodic finite variance
martingale difference sequences. In the heavy-tailed case such a general
result does not exist, but an analogous result with infinite variance
stable limits can be shown to hold under certain mixing conditions which
are satisfied for GARCH processes. It is the aim of the paper to give a
general structural result for infinite variance limits which can also be
applied in situations more general than GARCH.

1. Introduction. The motivation for writing this paper comes from Gaussian
quasi maximum likelihood estimation (QMLE) for GARCH (generalized
autoregressive conditionally heteroscedastic) processes with regularly
varying noise; we refer to Section 4 for a detailed description of the problem.
Recall that the process

(1.1)  $X_t = \sigma_t Z_t$ with $\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i X_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \quad t \in \mathbb{Z},$

is said to be a GARCH(p, q) process [GARCH process of order (p, q)]. Here
$(Z_t)$ is an i.i.d. sequence with $EZ_1^2 = 1$ and $EZ_1 = 0$, and
$\alpha_i, \beta_j$ are nonnegative constants. GARCH processes and their
parameter estimation have been intensively investigated over the last few
years; see [19] for a general overview and [28] and the references therein
for parameter estimation in GARCH and related models. In the context of
QMLE, the asymptotic behavior of the parameter estimator is essentially
determined by the limiting behavior of the quantity [see (4.13)]

$L_n'(\theta_0) = \frac{1}{2} \sum_{t=1}^{n} \frac{h_t'(\theta_0)}{\sigma_t^2} (Z_t^2 - 1),$

where $L_n'$ is the derivative of the underlying log-likelihood, $h_t'$ is
the derivative of $\sigma_t^2$ when considered as a function of the
parameter $\theta$, and $\theta_0$ is the true parameter (consisting of the
$\alpha_i$ and $\beta_j$ values) in a certain parameter space. In this
context,

$\mathbf{G}_t = \frac{h_t'(\theta_0)}{\sigma_t^2}, \quad t \in \mathbb{Z},$

is a stationary ergodic sequence of vector-valued random variables which is
adapted to the filtration $\mathcal{F}_t = \sigma(Y_{t-1}, Y_{t-2}, \ldots)$,
$t \in \mathbb{Z}$, where $Y_t = Z_t^2 - 1$ constitutes an i.i.d. sequence.
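To make the recursion (1.1) and the noise $Y_t = Z_t^2 - 1$ concrete, the following is a minimal sketch for the special case GARCH(1,1). The function name `simulate_garch11`, the argument `sigma2_init` (an assumed starting value for $\sigma_1^2$) and the choice to pass the innovations in explicitly are illustrative assumptions, not part of the paper.

```python
import math

def simulate_garch11(z, alpha0, alpha1, beta1, sigma2_init):
    """Run the GARCH(1,1) recursion from (1.1) on a given innovation sequence.

    z           : innovations Z_1, ..., Z_n (i.i.d. draws supplied by the caller)
    sigma2_init : value used for sigma_1^2 (an assumed starting value)
    Returns the lists (X_t), (sigma_t^2) and (Y_t) with Y_t = Z_t^2 - 1.
    """
    x, sigma2, y = [], [], []
    s2 = sigma2_init
    for zt in z:
        xt = math.sqrt(s2) * zt          # X_t = sigma_t * Z_t
        x.append(xt)
        sigma2.append(s2)
        y.append(zt * zt - 1.0)          # noise driving the martingale transform
        # sigma_{t+1}^2 = alpha0 + alpha1 * X_t^2 + beta1 * sigma_t^2
        s2 = alpha0 + alpha1 * xt * xt + beta1 * s2
    return x, sigma2, y
```

Feeding in heavy-tailed innovations (for instance standardized Student-t draws with few degrees of freedom) would produce the regime studied in this paper, whereas Gaussian draws give the light-tailed case.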

If $\mathbf{G}_t$ has a finite first moment, the sequence
$(\mathbf{G}_t Y_t)$ is a transform of the martingale difference sequence
$(Y_t)$, hence, a stationary ergodic martingale difference sequence with
respect to $(\mathcal{F}_t)$. If $E|\mathbf{G}_1|^2 < \infty$ and
$EY_1^2 < \infty$, an application of the central limit theorem (CLT) for
finite variance stationary ergodic martingale differences (see [4],
Theorem 23.1) yields

$n^{-1/2} \sum_{t=1}^{n} \mathbf{G}_t Y_t \xrightarrow{d} N(\mathbf{0}, \Sigma),$

where $\Sigma$ is the covariance matrix of $\mathbf{G}_1 Y_1$. This result
does not require any additional information about the dependence structure
of $(\mathbf{G}_t Y_t)$. It implies the asymptotic normality of the
parameter estimator based on QMLE.
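The limiting covariance can be made slightly more explicit. The short computation below uses only that $\mathbf{G}_1$ is $\mathcal{F}_1$-measurable while $Y_1$ is independent of $\mathcal{F}_1$; the last identity additionally uses $Y_1 = Z_1^2 - 1$ and $EZ_1^2 = 1$ from the GARCH setting. It is offered as a worked illustration rather than as a statement taken from the paper.

```latex
\Sigma = E\bigl[(\mathbf{G}_1 Y_1)(\mathbf{G}_1 Y_1)^{T}\bigr]
       = E\bigl[Y_1^2\bigr]\, E\bigl[\mathbf{G}_1 \mathbf{G}_1^{T}\bigr]
       = \bigl(E Z_1^4 - 1\bigr)\, E\bigl[\mathbf{G}_1 \mathbf{G}_1^{T}\bigr].
```

In particular, $\Sigma$ is finite only if $EZ_1^4 < \infty$, which is exactly the moment condition that fails in the heavy-tailed case.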

If $EY_1^2 = \infty$, a result as general as the CLT for stationary ergodic
martingale differences is not known. However, some limit results for
stationary sequences with marginal distribution in the domain of attraction
of an infinite variance stable distribution exist. We recall two of them in
Section 2. Our interest in infinite variance stable limit distributions for
$\sum_{t=1}^{n} \mathbf{G}_t Y_t$ is again closely related to parameter
estimation for GARCH processes. Recently, Hall and Yao [16] gave the
asymptotic theory for QMLE in GARCH models when $EZ_1^4 = \infty$. To be
more specific, they assume regular variation with index $\alpha \in (1,2)$
for the distribution of $Z_1^2$. It is our aim to show that their results
can be obtained by a general limit result for the martingale transforms
$\sum_{t=1}^{n} \mathbf{G}_t Y_t$ when the i.i.d. noise $(Y_t)$ is regularly
varying with index $\alpha \in (1,2)$. The key notions in this context are
regular variation of the finite-dimensional distributions of
$(\mathbf{G}_t Y_t)$ and strong mixing of this sequence; see Section 2 for
these notions.

Our objective is twofold. First, we want to show that the theories on
parameter estimation for GARCH processes with heavy- or light-tailed
innovations $(Z_t)$ parallel each other. We use the recent structural
approach to GARCH estimation by Berkes et al. [3] in order to show that
such a unified approach is possible. Second, our approach to the asymptotic
theory for parameter estimators is not restricted to GARCH processes. In
the light-tailed case, Straumann and Mikosch [28] extended the approach by
Berkes et al. [3], including among others AGARCH and EGARCH processes. The
main difficulty of our approach when infinite variance limits occur is the
verification of certain mixing conditions. In contrast to the case of
asymptotic normality, such conditions cannot be avoided. However, it is
difficult to check for a given model that these conditions hold; see
Section 4.4 in order to get a flavor of the task to be solved.

GARCH processes and their parameter estimation give the motivation 
for this paper. The corresponding limit theory for the QMLE with heavy- 
tailed innovations can be found in Section 4. Our main tool for achieving 
these limit results is based on asymptotic theory for martingale transforms 
with infinite variance stable limits. This theory is formulated and proved in 
Section 3. It is based on more general results for sums of stationary mix- 
ing vector sequences with regularly varying finite-dimensional distributions. 
This theory is outlined in Section 2. 

2. Preliminaries. In this section we collect some basic tools and notions 
to be used throughout this paper. First we want to formulate a classical re- 
sult on infinite variance stable limits for i.i.d. vector-valued summands due 
to Rvaceva [25]. Before we formulate this result, we recall the notions of 
stable random vector and multivariate regular variation. The class of sta- 
ble random vectors coincides with the class of possible limit distributions 
for sums of i.i.d. random vectors, and multivariate regular variation is the 
domain of attraction condition for sums of i.i.d. random vectors. Then we 
continue with an analog of Rvaceva's result for stationary ergodic vector 
sequences. In this context, we also need to recall some mixing conditions. 
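Stable random vectors. Recall that a vector $\mathbf{X}$ with values in
$\mathbb{R}^d$ is said to be $\alpha$-stable for some $\alpha \in (0,2)$ if
its characteristic function is given by

(2.1)  $E e^{i(\mathbf{x},\mathbf{X})} =
\begin{cases}
\exp\Bigl\{ -\int_{\mathbb{S}^{d-1}} |(\mathbf{x},\mathbf{y})|^{\alpha} \bigl(1 - i\,\mathrm{sign}((\mathbf{x},\mathbf{y})) \tan(\pi\alpha/2)\bigr)\, \Gamma(d\mathbf{y}) + i(\mathbf{x},\boldsymbol{\mu}) \Bigr\}, & \alpha \ne 1, \\
\exp\Bigl\{ -\int_{\mathbb{S}^{d-1}} |(\mathbf{x},\mathbf{y})| \bigl(1 + i \tfrac{2}{\pi}\, \mathrm{sign}((\mathbf{x},\mathbf{y})) \log|(\mathbf{x},\mathbf{y})|\bigr)\, \Gamma(d\mathbf{y}) + i(\mathbf{x},\boldsymbol{\mu}) \Bigr\}, & \alpha = 1,
\end{cases}$

where $(\mathbf{x}, \mathbf{y})$ denotes the usual inner product in
$\mathbb{R}^d$ and $|\cdot|$ the Euclidean norm; see [27], Theorem 2.3.1.
The index of stability $\alpha \in (0,2)$, the spectral measure $\Gamma$ on
the unit sphere $\mathbb{S}^{d-1}$ and the location parameter
$\boldsymbol{\mu}$ uniquely determine the distribution of an infinite
variance $\alpha$-stable random vector $\mathbf{X}$.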

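Multivariate regular variation. If $\mathbf{X}$ is $\alpha$-stable for some
$\alpha \in (0,2)$, it is regularly varying with index $\alpha$. This means
the following. The random vector $\mathbf{X}$ with values in $\mathbb{R}^d$
is regularly varying with index $\alpha > 0$ if there exists a random
vector $\boldsymbol{\Theta}$ with values in the unit sphere
$\mathbb{S}^{d-1}$ of $\mathbb{R}^d$ such that for any $t > 0$, as
$x \to \infty$,

(2.2)  $\frac{P(|\mathbf{X}| > t x,\ \tilde{\mathbf{X}} \in \cdot)}{P(|\mathbf{X}| > x)} \xrightarrow{v} t^{-\alpha} P(\boldsymbol{\Theta} \in \cdot),$

where for any vector $\mathbf{x} \ne \mathbf{0}$,

$\tilde{\mathbf{x}} = \mathbf{x}/|\mathbf{x}|,$

and $\xrightarrow{v}$ denotes vague convergence in the Borel
$\sigma$-field of $\mathbb{S}^{d-1}$; see [22, 23] for its definition and
details. The distribution of $\boldsymbol{\Theta}$ is called the spectral
measure of $\mathbf{X}$. Alternatively, (2.2) is equivalent to

(2.3)  $\frac{P(x^{-1}\mathbf{X} \in \cdot)}{P(|\mathbf{X}| > x)} \xrightarrow{v} \mu(\cdot),$

where $\xrightarrow{v}$ denotes vague convergence in the Borel
$\sigma$-field of $\overline{\mathbb{R}}^d \setminus \{\mathbf{0}\}$ and
$\mu$ is a measure on the same $\sigma$-field satisfying the homogeneity
assumption $\mu(tA) = t^{-\alpha}\mu(A)$ for $t > 0$.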

Remark 2.1. The property of regular variation of $\mathbf{X}$ with index
$\alpha$ does not depend on the chosen norm. However, the spectral measure
(the unit spheres $\mathbb{S}^{d-1}$ depend on the norm) and the limiting
measure $\mu$ can be different for distinct norms. The asymptotic theory of
this paper does not depend on the particular choice of the norm $|\cdot|$.
Unless specified otherwise, we will, however, assume that $|\cdot|$ is the
Euclidean norm.
To give some intuition on regular variation of a vector $\mathbf{X}$, we
mention some immediate consequences of the definition. Regular variation of
$\mathbf{X}$ implies that $|\mathbf{X}|$ is regularly varying:
$P(|\mathbf{X}| > x) = L(x)x^{-\alpha}$, where $L(x)$ is slowly varying in
the sense that $L(cx)/L(x) \to 1$ as $x \to \infty$, for every $c > 0$.
This property follows by plugging the set $\mathbb{S}^{d-1}$ into (2.2).
Moreover, relation (2.3) implies that every linear combination
$(\mathbf{a}, \mathbf{X})$, $\mathbf{a} \ne \mathbf{0}$, of the components
of $\mathbf{X}$ is regularly varying with the same index $\alpha$. This
follows by plugging the $d$-dimensional halfspace
$\{\mathbf{x} \in \mathbb{R}^d : (\mathbf{a}, \mathbf{x}) > 1\}$ into (2.3).

Definition (2.2) has an equivalent sequential analog in the following sense.
Choosing any sequence $a_n \to \infty$ such that

(2.4)  $n P(|\mathbf{X}| > a_n) \to 1,$

(2.2) is equivalent to

(2.5)  $n P(|\mathbf{X}| > t a_n,\ \tilde{\mathbf{X}} \in S) \to t^{-\alpha} P(\boldsymbol{\Theta} \in S), \quad t > 0,$

for all Borel sets $S \subset \mathbb{S}^{d-1}$ with
$P(\boldsymbol{\Theta} \in \partial S) = 0$. By an application of Poisson's
limit theorem, the latter relation implies for an i.i.d. sequence
$(\mathbf{X}_i)$ with the same marginal distribution as $\mathbf{X}$ that
the binomial random variable

(2.6)  $N_n((t,\infty) \times S) = \sum_{i=1}^{n} I_{(t,\infty) \times S}\bigl((a_n^{-1}|\mathbf{X}_i|, \tilde{\mathbf{X}}_i)\bigr) \xrightarrow{d} N((t,\infty) \times S),$

where the limiting variable is Poisson with parameter
$t^{-\alpha} P(\boldsymbol{\Theta} \in S)$ and $I_A$ denotes the indicator
function of $A$. This binomial variable counts those exceedances of the
scaled lengths $a_n^{-1}|\mathbf{X}_1|, \ldots, a_n^{-1}|\mathbf{X}_n|$ of
the vectors $\mathbf{X}_i$ above the threshold $t$ for which the angles of
the $\mathbf{X}_i$'s fall into the set $S$. The distributional convergence
(2.6) can be extended to the weak convergence of the underlying point
processes $N_n$ toward a Poisson process $N$ on
$\overline{\mathbb{R}}^d \setminus \{\mathbf{0}\}$, $\mu$ being its mean
measure; we omit the details and refer again to the mentioned literature
[22, 23]. However, the limit relation (2.6) already explains to some extent
what the spectral measure describes (in an asymptotic sense): it gives the
likelihood that the angles of the i.i.d. regularly varying vectors
$\mathbf{X}_1, \ldots, \mathbf{X}_n$ "far away from the origin" fall into a
specified set $S$.

The Poisson convergence result (2.6) also tells us what "far away from the
origin" means: the scaling $a_n$ of the $\mathbf{X}_i$'s has to be chosen
according to the condition (2.4). We see in the sequel that this condition
will appear in various disguises. Finally, we mention that (2.3) can be
written in equivalent sequential form with $(a_n)$ satisfying (2.4) as

$n P(a_n^{-1}\mathbf{X} \in \cdot) \xrightarrow{v} \mu(\cdot).$
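The role of the scaling condition (2.4) can be seen in closed form for an exactly Pareto-distributed length $|\mathbf{X}|$. The sketch below assumes the tail $P(|\mathbf{X}| > x) = x^{-\alpha}$ for $x \ge 1$ (an illustrative choice with slowly varying function $L \equiv 1$, not a model from the paper); the helper names are hypothetical. In that case $a_n = n^{1/\alpha}$ satisfies (2.4) exactly, and the mean number of scaled exceedances above $t$ equals $t^{-\alpha}$, the Poisson parameter in (2.6) with $S = \mathbb{S}^{d-1}$.

```python
def pareto_tail(x, alpha):
    """P(|X| > x) for the assumed Pareto law: x**(-alpha) for x >= 1, else 1."""
    return x ** (-alpha) if x >= 1.0 else 1.0

def scaling_constant(n, alpha):
    """a_n with n * P(|X| > a_n) = 1 for the Pareto tail above."""
    return n ** (1.0 / alpha)

def expected_exceedances(n, t, alpha):
    """n * P(|X| > t * a_n): mean of the binomial count of exceedances above t."""
    return n * pareto_tail(t * scaling_constant(n, alpha), alpha)
```

For a general regularly varying tail the same quantities only agree asymptotically, with $a_n = n^{1/\alpha}\ell(n)$ for some slowly varying $\ell$.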
Stable limits for sums of i.i.d. random vectors. Now let $(\mathbf{Y}_t)$
be an i.i.d. sequence of random vectors with values in $\mathbb{R}^d$.
According to Rvačeva [25], there exist sequences of constants $a_n > 0$ and
$\mathbf{b}_n \in \mathbb{R}^d$ such that

$a_n^{-1} \sum_{t=1}^{n} \mathbf{Y}_t - \mathbf{b}_n \xrightarrow{d} \mathbf{X}_\alpha$

for some $\alpha$-stable random vector $\mathbf{X}_\alpha$ with
$\alpha \in (0,2)$ if and only if $\mathbf{Y}_1$ is regularly varying with
index $\alpha$, and the normalizing constants $a_n$ can be chosen as

(2.7)  $P(|\mathbf{Y}_1| > a_n) \sim n^{-1}.$

Notice that (2.7) is directly comparable with condition (2.4), which appears
in the sequential definition of regular variation.

For a stationary sequence $(\mathbf{Y}_t)$, a similar result can be found
in [13] as a multivariate extension of one-dimensional results in [12]. For
its formulation one needs regular variation of the summands and a
particular mixing condition, called $\mathcal{A}(a_n)$, which was
introduced in [12].

Mixing conditions. We say that the condition $\mathcal{A}(a_n)$ holds for
the stationary sequence $(\mathbf{Y}_t)$ of random vectors with values in
$\mathbb{R}^d$ if there exists a sequence of positive integers $r_n$ such
that $r_n \to \infty$, $k_n = [n/r_n] \to \infty$ as $n \to \infty$ and

(2.8)  $E \exp\Bigl\{ -\sum_{t=1}^{n} f(\mathbf{Y}_t/a_n) \Bigr\} - \Bigl( E \exp\Bigl\{ -\sum_{t=1}^{r_n} f(\mathbf{Y}_t/a_n) \Bigr\} \Bigr)^{k_n} \to 0, \quad n \to \infty, \quad \forall f \in \mathcal{G}_s,$

where $\mathcal{G}_s$ is the collection of bounded nonnegative step
functions on $\overline{\mathbb{R}}^d \setminus \{\mathbf{0}\}$. The
convergence in (2.8) is not required to be uniform in $f$. This is indeed a
very weak condition and is implied by many known mixing conditions, in
particular, the strong mixing condition which is relevant in the context of
GARCH processes; see Section 4. We refer to [13] for a comparison of
$\mathcal{A}(a_n)$ with other mixing conditions.

For later use we also recall the definition of a strongly mixing stationary
sequence $(\mathbf{Y}_t)$ of random vectors with rate function $(\phi_k)$
(see [24], cf. [14] or [17]):

$\sup_{A \in \sigma(\mathbf{Y}_s,\, s \le 0),\ B \in \sigma(\mathbf{Y}_s,\, s > k)} |P(A \cap B) - P(A)P(B)| =: \phi_k \to 0 \quad \text{as } k \to \infty.$

If $(\phi_k)$ decays to zero at an exponential rate, then $(\mathbf{Y}_t)$
is said to be strongly mixing with geometric rate. In Section 4.4 we use a
more stringent notion of mixing, called $\beta$-mixing or absolute
regularity. It implies strong mixing with the same rate function.
Stable limits for sums of stationary random variables. The following result
is a combination of Theorem 2.8 and Proposition 3.3 in [13]. It gives
conditions under which an $\alpha$-stable weak limit occurs for the sum
process of a stationary sequence. In what follows we write

$\mathbf{S}_0 = \mathbf{0}$ and $\mathbf{S}_n = \mathbf{Y}_1 + \cdots + \mathbf{Y}_n, \quad n \ge 1,$

and for any Borel set $B \subset \mathbb{R}$,

$\mathbf{S}_0(B) = \mathbf{0}$ and $\mathbf{S}_n(B) = \sum_{t=1}^{n} \mathbf{Y}_t I_B(|\mathbf{Y}_t|/a_n), \quad n \ge 1.$

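Theorem 2.2. Let $(\mathbf{Y}_t)$ be a strictly stationary sequence of
random vectors with values in $\mathbb{R}^d$ and the real sequence $(a_n)$
be defined by (2.7). Assume that the following conditions are satisfied:

(a) The finite-dimensional distributions of $(\mathbf{Y}_t)$ are regularly
varying with index $\alpha > 0$. To be specific, let
$\mathrm{vec}(\boldsymbol{\theta}_{-k}^{(k)}, \ldots, \boldsymbol{\theta}_{k}^{(k)})$
be the $(2k+1)d$-dimensional random row vector with values in the unit
sphere $\mathbb{S}^{(2k+1)d-1}$ that appears in the definition (2.2) of
regular variation of $\mathrm{vec}(\mathbf{Y}_{-k}, \ldots, \mathbf{Y}_k)$,
$k \ge 0$, with respect to the max-norm $|\cdot|$ in $\mathbb{R}^{(2k+1)d}$.

(b) The mixing condition $\mathcal{A}(a_n)$ holds for $(\mathbf{Y}_t)$.

(c)
(2.9)  $\lim_{k \to \infty} \limsup_{n \to \infty} P\Bigl( \bigvee_{k \le |t| \le r_n} |\mathbf{Y}_t| > a_n y \Bigm| |\mathbf{Y}_0| > a_n y \Bigr) = 0, \quad y > 0,$

where $(r_n)$ appears in the formulation of $\mathcal{A}(a_n)$.

Then the limit

(2.10)  $\gamma = \lim_{k \to \infty} \frac{E\bigl( |\boldsymbol{\theta}_0^{(k)}|^{\alpha} - \bigvee_{j=1}^{k} |\boldsymbol{\theta}_j^{(k)}|^{\alpha} \bigr)_+}{E|\boldsymbol{\theta}_0^{(k)}|^{\alpha}}$

exists. If $\gamma > 0$, then the following results hold:

(i) If $\alpha \in (0,1)$, then

$a_n^{-1} \mathbf{S}_n \xrightarrow{d} \mathbf{X}_\alpha$

for some $\alpha$-stable random vector $\mathbf{X}_\alpha$.

(ii) If $\alpha \in [1,2)$, and for all $\delta > 0$,

(2.11)  $\lim_{y \to 0} \limsup_{n \to \infty} P\bigl( |\mathbf{S}_n(0,y] - E\mathbf{S}_n(0,y]| > \delta a_n \bigr) = 0,$

then

$a_n^{-1}\bigl( \mathbf{S}_n - E\mathbf{S}_n(0,1] \bigr) \xrightarrow{d} \mathbf{X}_\alpha$

for some $\alpha$-stable random vector $\mathbf{X}_\alpha$.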

Remark 2.3. The structure of the limiting vectors $\mathbf{X}_\alpha$ is
given by some functional of the points of a limiting point process. The
proof of this result makes heavy use of point process convergence results,
which are appropriate tools in the context of regularly varying
distributions when extremely large values may occur in the sequence
$(\mathbf{Y}_t)$; see [13] for details. This leaves the parameters in the
characteristic function (2.1) unspecified (with the exception of $\alpha$);
a specification is not available so far and requires further investigation.

Remark 2.4. The quantity $\gamma$ in (2.10) can be identified as the
extremal index of the sequence $(|\mathbf{Y}_t|)$; see [12] and Remark 2.3
in [13]. The extremal index $\gamma \in [0,1]$ of a strictly stationary
real-valued sequence is a number which characterizes the clustering
behavior of the sequence above high thresholds. Roughly speaking, its
existence ensures that the approximate relationship

$P\bigl( \max_{t=1,\ldots,n} |\mathbf{Y}_t| \le u_n \bigr) \approx P^{n\gamma}(|\mathbf{Y}_1| \le u_n)$

holds for suitable sequences $u_n \to \infty$. For the definition and
interpretation of the extremal index, we refer to [18] and [15],
Section 8.1. The case $\gamma = 0$ corresponds to the case of sequences
with unusually large cluster sizes above high thresholds. This case is
often considered pathological; see [18] for some examples and the recent
paper by Samorodnitsky [26]. For $\gamma = 0$ the limit theory developed in
[12, 13] yields that the weak limit results in the above theorem hold with
zero limit.
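A textbook-style illustration of the extremal index, not taken from this paper, is the moving maximum $X_t = \max(Z_t, Z_{t-1})$ with i.i.d. Fréchet innovations $Z_t$, $P(Z_t \le u) = \exp(-u^{-\alpha})$. Here exceedances come in pairs, and the approximate relationship in Remark 2.4 holds with $\gamma = 1/2$. The sketch below (function names are hypothetical) evaluates both sides exactly on the log scale: $P(\max_{t \le n} X_t \le u) = P(Z_0 \le u, \ldots, Z_n \le u)$ and $P(X_1 \le u) = P(Z_0 \le u, Z_1 \le u)$.

```python
def log_frechet_cdf(u, alpha=1.0):
    """log F(u) for the Frechet law F(u) = exp(-u**(-alpha)), u > 0."""
    return -u ** (-alpha)

def moving_max_extremal_ratio(n, u, alpha=1.0):
    """log P(max_{t<=n} X_t <= u) / (n * log P(X_1 <= u)) for X_t = max(Z_t, Z_{t-1}).

    If P(max <= u) is approximately P(X_1 <= u)**(n * gamma), this ratio
    approximates the extremal index gamma.
    """
    log_f = log_frechet_cdf(u, alpha)
    log_p_max = (n + 1) * log_f      # the maximum of X_1..X_n involves Z_0, ..., Z_n
    log_p_marginal = 2 * log_f       # X_1 <= u iff Z_0 <= u and Z_1 <= u
    return log_p_max / (n * log_p_marginal)
```

The ratio equals $(n+1)/(2n)$ for every threshold $u > 0$ and tends to $1/2$ as $n \to \infty$.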

3. Stable limits for martingale transform. In this section we want to
derive infinite variance stable limits for sums of strictly stationary
random vectors which have the particular form

$\mathbf{Y}_t = \mathbf{G}_t Y_t,$

where $(Y_t)$ is an i.i.d. sequence and $(\mathbf{G}_t)$ is a strictly
stationary sequence of random vectors with values in $\mathbb{R}^d$ such
that $(\mathbf{G}_t)$ is adapted to the filtration given by the
$\sigma$-fields $\mathcal{F}_t = \sigma(Y_{t-1}, Y_{t-2}, \ldots)$,
$t \in \mathbb{Z}$. If $EY_1 = 0$ and $E|\mathbf{G}_1| < \infty$,
$E(\mathbf{G}_t Y_t \mid \mathcal{F}_t) = \mathbf{0}$ a.s., and, therefore,
$(\mathbf{G}_t Y_t)$ is a martingale difference sequence and

$\mathbf{S}_0 = \mathbf{0}, \quad \mathbf{S}_n = \mathbf{Y}_1 + \cdots + \mathbf{Y}_n, \quad n \ge 1,$

is the martingale transform of the martingale
$(\sum_{t=1}^{n} Y_t)_{n \ge 0}$ by the sequence $(\mathbf{G}_t)$. We keep
this name even if $E|Y_1| = \infty$.

3.1. Basic assumptions. We impose the following assumptions on the
sequences $(Y_t)$ and $(\mathbf{G}_t)$:

A.1. $Y_1$ is regularly varying with index $\alpha \in (0,2)$.
A.2. $E|\mathbf{G}_1|^{\alpha+\epsilon} < \infty$ for some $\epsilon > 0$.

A.3. $(\mathbf{G}_t Y_t)$ satisfies condition $\mathcal{A}(a_n)$ [see
(2.8)], where $P(|Y_1| > a_n) \sim n^{-1}$ and $(r_n)$, defined in (2.8),
is such that

(3.1)  $n\, r_n \Bigl( \frac{a_{r_n}}{a_n} \Bigr)^{\alpha+\epsilon} \to 0,$

where $\epsilon$ is the same as in A.2.
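As a rough orientation for condition (3.1), read here as $n r_n (a_{r_n}/a_n)^{\alpha+\epsilon} \to 0$, consider the simplifying assumptions (ours, not part of A.1-A.3) that $a_n = n^{1/\alpha}$ exactly, ignoring slowly varying factors, and that the block length is a power $r_n = n^{\rho}$ with $\rho \in (0,1)$. Then the condition reduces to a restriction on $\rho$:

```latex
n\, r_n \Bigl(\frac{a_{r_n}}{a_n}\Bigr)^{\alpha+\epsilon}
  = n^{1+\rho}\, n^{(\rho-1)(\alpha+\epsilon)/\alpha}
  = n^{\rho(2+\epsilon/\alpha) - \epsilon/\alpha} \to 0
\quad\Longleftrightarrow\quad
\rho < \frac{\epsilon}{2\alpha+\epsilon}.
```

Under these assumptions only slowly growing blocks are admissible, and a larger $\epsilon$ (more moments of $\mathbf{G}_1$) allows longer blocks.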

Remark 3.1. Regular variation of $Y_1$ with index $\alpha$ and the i.i.d.
property of $(Y_t)$ imply that

$P\Bigl( a_n^{-1} \max_{1 \le t \le n} |Y_t| \le x \Bigr) \to \Phi_\alpha(x) = e^{-x^{-\alpha}}, \quad x > 0,$

for the Fréchet distribution $\Phi_\alpha$; see [15], Chapter 3.
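The convergence in Remark 3.1 can be checked by hand when $|Y_1|$ is exactly Pareto. The sketch below assumes $P(|Y_1| > y) = y^{-\alpha}$ for $y \ge 1$ and $a_n = n^{1/\alpha}$ (illustrative choices with hypothetical function names), so that $P(a_n^{-1}\max_{t \le n}|Y_t| \le x) = (1 - x^{-\alpha}/n)^n$ whenever $x a_n \ge 1$.

```python
import math

def max_cdf_pareto(n, x, alpha):
    """P(a_n^{-1} * max_{t<=n} |Y_t| <= x) for i.i.d. Pareto lengths.

    Assumes the tail P(|Y_1| > y) = y**(-alpha) for y >= 1 and a_n = n**(1/alpha).
    """
    a_n = n ** (1.0 / alpha)
    u = x * a_n
    tail = u ** (-alpha) if u >= 1.0 else 1.0
    return (1.0 - tail) ** n          # independence of the Y_t

def frechet_cdf(x, alpha):
    """Frechet distribution function Phi_alpha(x) = exp(-x**(-alpha)), x > 0."""
    return math.exp(-x ** (-alpha))
```

Comparing `max_cdf_pareto(n, x, alpha)` with `frechet_cdf(x, alpha)` for growing `n` shows the gap shrinking at rate of order $1/n$ in this special case.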

In this setting, the heaviness of the tails of the distribution of
$\mathbf{G}_1 Y_1$ is essentially determined by the distribution of $Y_1$;
see Remark 3.4 below.

3.2. Main result. We are now ready to formulate our main result on the
asymptotic behavior of the sum process $(\mathbf{S}_n)$.

Theorem 3.2. Consider the martingale transform

$\Bigl( \sum_{t=1}^{n} \mathbf{Y}_t \Bigr)_{n \ge 0} = \Bigl( \sum_{t=1}^{n} \mathbf{G}_t Y_t \Bigr)_{n \ge 0}$

defined above. Assume that the conditions A.1-A.3 are satisfied. Moreover,
if $\alpha \in (1,2)$, assume that $EY_1 = 0$ and, if $\alpha = 1$, that
$Y_1$ is symmetric. Then the finite-dimensional distributions of
$(\mathbf{Y}_t)$ are regularly varying with index $\alpha$ and the limit
$\gamma$ in (2.10) exists. If $\gamma > 0$, then

(3.2)  $a_n^{-1} \mathbf{S}_n \xrightarrow{d} \mathbf{X}_\alpha,$

where the sequence $(a_n)$ is given by

$P(|Y_1| > a_n) \sim n^{-1}$

and $\mathbf{X}_\alpha$ is an $\alpha$-stable random vector.
Remark 3.3. In the case when
$E|\mathbf{G}_1|^{2+\delta} + E|Y_1|^{2+\delta} < \infty$ and $EY_1 = 0$,
(3.2) turns into $n^{-1/2}\mathbf{S}_n \xrightarrow{d} \mathbf{X}$, where
$\mathbf{X}$ is Gaussian with mean zero and the same covariance structure
as $\mathbf{G}_1 Y_1$. This follows since $(\mathbf{G}_t Y_t)$ is a strictly
stationary martingale difference sequence; see [4].

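Remark 3.4. It is not difficult to see that $\mathbf{Y}_1$ is regularly
varying with index $\alpha$. For the proof we need a result of Breiman
[11]. It says that if one has two independent random variables
$\xi, \eta > 0$ a.s., $\xi$ is regularly varying with index $\alpha > 0$
and $E\eta^{\kappa} < \infty$ for some $\kappa > \alpha$, then

$P(\xi\eta > x) \sim E\eta^{\alpha}\, P(\xi > x),$

that is, $\xi\eta$ is regularly varying with the same index $\alpha$. Now
observe that, for $t, x > 0$ and a Borel set $S \subset \mathbb{S}^{d-1}$,
by multiple application of Breiman's result,

$\frac{P(|\mathbf{G}_1||Y_1| > tx,\ \mathbf{G}_1 Y_1/(|\mathbf{G}_1||Y_1|) \in S)}{P(|\mathbf{G}_1||Y_1| > x)}$

$\quad = \frac{P(|\mathbf{G}_1||Y_1| > tx,\ \mathrm{sign}(Y_1)\tilde{\mathbf{G}}_1 \in S)}{P(|\mathbf{G}_1||Y_1| > x)}$

$\quad = \frac{P(|\mathbf{G}_1| Y_1 > tx,\ \tilde{\mathbf{G}}_1 \in S)}{P(|\mathbf{G}_1||Y_1| > x)} + \frac{P(|\mathbf{G}_1| Y_1 < -tx,\ -\tilde{\mathbf{G}}_1 \in S)}{P(|\mathbf{G}_1||Y_1| > x)}$

$\quad \sim \frac{E(|\mathbf{G}_1|^{\alpha} I_S(\tilde{\mathbf{G}}_1))\, P(Y_1 > tx)}{E|\mathbf{G}_1|^{\alpha}\, P(|Y_1| > x)} + \frac{E(|\mathbf{G}_1|^{\alpha} I_S(-\tilde{\mathbf{G}}_1))\, P(Y_1 < -tx)}{E|\mathbf{G}_1|^{\alpha}\, P(|Y_1| > x)}.$

Writing for some $p, q \ge 0$ with $p + q = 1$ and a slowly varying
function $L(x)$,

$P(Y_1 > x) = pL(x)x^{-\alpha}$ and $P(Y_1 \le -x) = qL(x)x^{-\alpha}, \quad x > 0,$

we can read off the spectral measure of the vector $\mathbf{Y}_1$:

(3.3)  $P(\boldsymbol{\Theta} \in S) = p\, \frac{E(|\mathbf{G}_1|^{\alpha} I_S(\tilde{\mathbf{G}}_1))}{E|\mathbf{G}_1|^{\alpha}} + q\, \frac{E(|\mathbf{G}_1|^{\alpha} I_S(-\tilde{\mathbf{G}}_1))}{E|\mathbf{G}_1|^{\alpha}}.$

By regular variation, $a_n = n^{1/\alpha}\ell(n)$ for some slowly varying
function $\ell$. By Breiman's result and since
$E|\mathbf{G}_1|^{\alpha+\epsilon} < \infty$ for some $\epsilon > 0$, it
also follows that

$P(|\mathbf{G}_1||Y_1| > x) \sim E|\mathbf{G}_1|^{\alpha}\, P(|Y_1| > x),$

and, therefore, $P(|\mathbf{Y}_1| > c a_n) \sim n^{-1}$ for some constant
$c > 0$. Moreover, we have

(3.4)  $n P(a_n^{-1}\mathbf{Y}_1 \in \cdot) \xrightarrow{v} \mu_1,$

for some measure $\mu_1$ on
$\overline{\mathbb{R}}^d \setminus \{\mathbf{0}\}$ which is determined by
$\alpha$ and the spectral measure.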
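Breiman's result can be verified exactly in a toy setting, which may help to see where the factor $E\eta^{\alpha}$ comes from. The sketch below assumes that $\xi$ is exactly Pareto, $P(\xi > x) = x^{-\alpha}$ for $x \ge 1$, and that $\eta$ takes finitely many positive values (both are illustrative assumptions, and all function names are hypothetical). Conditioning on $\eta$ gives $P(\xi\eta > x) = \sum_v P(\eta = v)\, P(\xi > x/v)$, which equals $E\eta^{\alpha}\, x^{-\alpha}$ as soon as $x$ exceeds the largest value of $\eta$.

```python
def pareto_tail(x, alpha):
    """P(xi > x) for the assumed Pareto law: x**(-alpha) for x >= 1, else 1."""
    return x ** (-alpha) if x >= 1.0 else 1.0

def product_tail(x, alpha, eta_values, eta_probs):
    """P(xi * eta > x), obtained by conditioning on the discrete variable eta."""
    return sum(p * pareto_tail(x / v, alpha) for v, p in zip(eta_values, eta_probs))

def breiman_ratio(x, alpha, eta_values, eta_probs):
    """P(xi * eta > x) / P(xi > x); Breiman's result says this tends to E eta^alpha."""
    return product_tail(x, alpha, eta_values, eta_probs) / pareto_tail(x, alpha)

def eta_moment(alpha, eta_values, eta_probs):
    """E eta^alpha for the discrete variable eta."""
    return sum(p * v ** alpha for v, p in zip(eta_values, eta_probs))
```

For small $x$ the ratio differs from $E\eta^{\alpha}$, which reflects that the statement is asymptotic; in the paper the role of $\eta$ is played by $|\mathbf{G}_1|$, whose support need not be bounded, so the moment condition A.2 is needed.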
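Remark 3.5. It follows from the proof below that

(3.5)  $n P\bigl(a_n^{-1}(\mathbf{Y}_1, \ldots, \mathbf{Y}_h) \in d(\mathbf{x}_1, \ldots, \mathbf{x}_h)\bigr)$
$\quad \xrightarrow{v} \mu_1(d\mathbf{x}_1)\,\varepsilon_{\mathbf{0}}(d(\mathbf{x}_2, \ldots, \mathbf{x}_h)) + \cdots + \mu_1(d\mathbf{x}_h)\,\varepsilon_{\mathbf{0}}(d(\mathbf{x}_1, \ldots, \mathbf{x}_{h-1}))$
$\quad =: \mu_h(d(\mathbf{x}_1, \ldots, \mathbf{x}_h)),$

where $\mu_1$ is defined by (3.4), $\varepsilon_{\mathbf{0}}$ is the Dirac
measure at $\mathbf{0}$ and

(3.6)  $(\mathbf{Y}_1, \ldots, \mathbf{Y}_h) := \mathrm{vec}(\mathbf{Y}_1, \ldots, \mathbf{Y}_h)$ and $(\mathbf{x}_1, \ldots, \mathbf{x}_h) := \mathrm{vec}(\mathbf{x}_1, \ldots, \mathbf{x}_h).$

This means, in particular, that the limiting measure in the definition of
regular variation for $(\mathbf{Y}_1, \ldots, \mathbf{Y}_h)$ is the same as
in the definition of regular variation for
$\mathrm{vec}(\mathbf{Y}_1', \ldots, \mathbf{Y}_h')$, where
$\mathbf{Y}_i'$ are i.i.d. copies of $\mathbf{Y}_1$. This part of the
theorem is valid for any $\alpha > 0$.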

Proof of Theorem 3.2. We verify the conditions of Theorem 2.2. Since A.3
implies $\mathcal{A}(a_n)$ and since we require $\gamma > 0$, it remains to
check (a) and (c) in Theorem 2.2.

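(a) Regular variation of the finite-dimensional distributions. We show
regular variation of the vector $(\mathbf{Y}_1, \ldots, \mathbf{Y}_h)$
defined in (3.6), that is, we show that (3.5) holds.

We restrict ourselves to the proof of regular variation of the pairs
$(\mathbf{Y}_1, \mathbf{Y}_2) := \mathrm{vec}(\mathbf{Y}_1, \mathbf{Y}_2)$;
the case of general finite-dimensional distributions is completely
analogous. The regular variation of $\mathbf{Y}_1$ was explained in
Remark 3.4. Let now $B_1$ and $B_2$ be two Borel sets in
$\overline{\mathbb{R}}^d \setminus \{\mathbf{0}\}$, bounded away from zero.
In particular, there exists $M > 0$ such that $|\mathbf{x}| > M$ for all
$\mathbf{x} \in B_1$ and $\mathbf{x} \in B_2$. Then for any
$\epsilon > 0$, by intersecting with the events
$\{|\mathbf{G}_i| \le \epsilon\}$ and $\{|\mathbf{G}_i| > \epsilon\}$,
$i = 1, 2$,

$\{a_n^{-1}\mathbf{Y}_1 \in B_1,\ a_n^{-1}\mathbf{Y}_2 \in B_2\}$
$\quad \subset \{|\mathbf{G}_1||Y_1| > M a_n,\ |\mathbf{G}_2||Y_2| > M a_n\}$
$\quad \subset \{\epsilon|Y_1| > M a_n,\ \epsilon|Y_2| > M a_n\}$
$\qquad \cup\ \{|\mathbf{G}_1| I_{(\epsilon,\infty)}(|\mathbf{G}_1|)|Y_1| > M a_n,\ \epsilon|Y_2| > M a_n\}$
$\qquad \cup\ \{|\mathbf{G}_2| I_{(\epsilon,\infty)}(|\mathbf{G}_2|)|Y_2| > M a_n,\ \epsilon|Y_1| > M a_n\}$
$\qquad \cup\ \{|\mathbf{G}_1| I_{(\epsilon,\infty)}(|\mathbf{G}_1|)|Y_1| > M a_n,\ |\mathbf{G}_2| I_{(\epsilon,\infty)}(|\mathbf{G}_2|)|Y_2| > M a_n\}$
$\quad =: \bigcup_{i=1}^{4} D_i.$

By independence and an application of Breiman's result, $nP(D_1) \to 0$ and
$nP(D_2) \to 0$. Similarly,

$nP(D_3) \le nP\bigl(|\mathbf{G}_2| I_{(\epsilon,\infty)}(|\mathbf{G}_2|)|Y_2| > M a_n\bigr) \sim nP(|Y_2| > M a_n)\, E\bigl(|\mathbf{G}_2|^{\alpha} I_{(\epsilon,\infty)}(|\mathbf{G}_2|)\bigr),$

and thus, by Lebesgue's dominated convergence theorem,

$\lim_{\epsilon \to \infty} \limsup_{n \to \infty} nP(D_3) = 0,$

and $nP(D_4) \to 0$ can be proved in the same way. We conclude that

$nP\bigl(a_n^{-1}(\mathbf{Y}_1, \mathbf{Y}_2) \in d(\mathbf{x}_1, \mathbf{x}_2)\bigr) \xrightarrow{v} \mu_1(d\mathbf{x}_1)\,\varepsilon_{\mathbf{0}}(d\mathbf{x}_2) + \mu_1(d\mathbf{x}_2)\,\varepsilon_{\mathbf{0}}(d\mathbf{x}_1) = \mu_2(d(\mathbf{x}_1, \mathbf{x}_2));$

see [23]. This proves the regular variation of the two-dimensional
finite-dimensional distributions. The higher-dimensional case is completely
analogous.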

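(c) The condition (2.9). We have for any $y > 0$,

$P\Bigl( \max_{k \le t \le r_n} |\mathbf{G}_t||Y_t| > y a_n \Bigm| |\mathbf{G}_0||Y_0| > y a_n \Bigr)$
$\quad \le P\Bigl( \max_{k \le t \le r_n} |\mathbf{G}_t| > y a_n/(s_k a_{r_n}) \Bigm| |\mathbf{G}_0||Y_0| > y a_n \Bigr) + P\Bigl( \max_{k \le t \le r_n} |Y_t| > s_k a_{r_n} \Bigr)$
$\quad =: I_1 + I_2,$

where $(s_k)$ is any sequence such that $s_k \to \infty$. In what follows
all calculations go through for any $y > 0$; for ease of notation, we set
$y = 1$. Then, by Remark 3.1,

$\lim_{k \to \infty} \lim_{n \to \infty} I_2 = \lim_{k \to \infty} (1 - \Phi_\alpha(s_k)) = 0.$

An application of Markov's inequality yields, for some constant $c > 0$ and
$\epsilon > 0$ as in A.2 (here and in what follows, $c$ denotes any
positive constant whose value is not of interest),

$I_1 \le \sum_{t=k}^{r_n} P\bigl( |\mathbf{G}_t| > a_n/(s_k a_{r_n}) \bigm| |\mathbf{G}_0||Y_0| > a_n \bigr)$
$\quad \le r_n\, \frac{P(|\mathbf{G}_1| > a_n/(s_k a_{r_n}))}{P(|\mathbf{G}_0||Y_0| > a_n)}$
$\quad \le c\, n\, r_n \Bigl( \frac{s_k a_{r_n}}{a_n} \Bigr)^{\alpha+\epsilon}$
$\quad \to 0 \quad \text{as } n \to \infty.$

Here we used Breiman's result [11] to show that

$P(|\mathbf{G}_0||Y_0| > a_n) \sim E|\mathbf{G}_0|^{\alpha}\, P(|Y_0| > a_n),$

condition (3.1) and the fact that
$E|\mathbf{G}_1|^{\alpha+\epsilon} < \infty$; see A.2.

Now we turn to

$P\Bigl( \max_{-r_n \le t \le -k} |\mathbf{G}_t||Y_t| > a_n \Bigm| |\mathbf{G}_0||Y_0| > a_n \Bigr)$
$\quad \le P\Bigl( \max_{-r_n \le t \le -k} |\mathbf{G}_t| > a_n/(s_k a_{r_n}) \Bigm| |\mathbf{G}_0||Y_0| > a_n \Bigr) + P\Bigl( \max_{-r_n \le t \le -k} |Y_t| > s_k a_{r_n} \Bigm| |\mathbf{G}_0||Y_0| > a_n \Bigr)$
$\quad =: I_3 + I_4.$

The quantity $I_3$ can be treated in the same way as $I_1$ to show that
$I_3 \to 0$ as $n \to \infty$. We turn to $I_4$. Fix $0 < M < \infty$. Then

$I_4 \le \frac{P\bigl( \max_{-r_n \le t \le -k} |Y_t| > s_k a_{r_n},\ M|Y_0| > a_n \bigr)}{P(|\mathbf{G}_0||Y_0| > a_n)} + \frac{P\bigl( |\mathbf{G}_0| I_{(M,\infty)}(|\mathbf{G}_0|)|Y_0| > a_n \bigr)}{P(|\mathbf{G}_0||Y_0| > a_n)}$
$\quad =: I_{41} + I_{42}.$

By independence of the $Y_t$'s, Breiman's [11] result and since
$r_n \to \infty$,

$I_{41} \sim \frac{P\bigl( \max_{-r_n \le t \le -k} |Y_t| > s_k a_{r_n} \bigr)\, M^{\alpha} P(|Y_0| > a_n)}{E|\mathbf{G}_0|^{\alpha}\, P(|Y_0| > a_n)}$
$\quad \sim c\,(1 - \Phi_\alpha(s_k)) \quad \text{as } n \to \infty$
$\quad \to 0 \quad \text{as } k \to \infty.$

By virtue of Breiman's [11] result,

$I_{42} \sim \frac{E\bigl( |\mathbf{G}_0|^{\alpha} I_{(M,\infty)}(|\mathbf{G}_0|) \bigr)\, P(|Y_0| > a_n)}{E|\mathbf{G}_0|^{\alpha}\, P(|Y_0| > a_n)}.$

Since $|\mathbf{G}_0|$ has finite moments of order greater than $\alpha$,
an application of the Lebesgue dominated convergence theorem yields

$\lim_{M \to \infty} \lim_{n \to \infty} I_{42} = 0.$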
This proves (2.9). □

Thus, the conditions (a)-(c) and $\gamma > 0$ of Theorem 2.2 are satisfied.
In the case $\alpha < 1$, Theorem 2.2 immediately yields (3.2). In the case
$\alpha \in [1,2)$, we have to check condition (2.11). It suffices to show
it for the components $S_n^{(i)}(0,y]$, $i = 1, \ldots, d$, of
$\mathbf{S}_n(0,y]$. Since the components can be handled in the same way,
we suppress the dependence on $i$ and, for ease of notation, write
$G_t Y_t$ for the summands of the $i$th component.

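We start with the case $\alpha \in (1,2)$. As before, write
$\mathcal{F}_t = \sigma(Y_{t-1}, Y_{t-2}, \ldots)$. Then, for $z > 0$,
since $EY_1 = 0$,

$E[G_t Y_t I_{(0,z]}(|G_t Y_t|/a_n) \mid \mathcal{F}_t] = G_t E[Y_t I_{(0,z]}(|G_t Y_t|/a_n) \mid G_t]$
$\quad = -G_t E[Y_t I_{(z,\infty)}(|G_t Y_t|/a_n) \mid G_t].$

Consider the decomposition

$a_n^{-1} \sum_{t=1}^{n} \bigl[ G_t Y_t I_{(0,z]}(|G_t Y_t|/a_n) - E[G_1 Y_1 I_{(0,z]}(|G_1 Y_1|/a_n)] \bigr]$
$\quad = a_n^{-1} \sum_{t=1}^{n} \bigl[ G_t Y_t I_{(0,z]}(|G_t Y_t|/a_n) - G_t E[Y_t I_{(0,z]}(|G_t Y_t|/a_n) \mid G_t] \bigr]$
$\qquad - a_n^{-1} \sum_{t=1}^{n} \bigl[ G_t E[Y_t I_{(z,\infty)}(|G_t Y_t|/a_n) \mid G_t] - E[G_1 Y_1 I_{(z,\infty)}(|G_1 Y_1|/a_n)] \bigr]$
$\quad =: T_1 + T_2.$

For fixed $n$, $T_1$ is a sum of stationary mean zero martingale
differences. An application of Karamata's theorem ([5], page 26) to the
regularly varying random variable $G_1 Y_1$ with index $\alpha$ yields for
some constant $c > 0$,

(3.7)  $\mathrm{var}(T_1) = n a_n^{-2} E\bigl[ \bigl( G_1 Y_1 I_{(0,z]}(|G_1 Y_1|/a_n) - G_1 E[Y_1 I_{(0,z]}(|G_1 Y_1|/a_n) \mid G_1] \bigr)^2 \bigr]$
$\quad \le c\, n a_n^{-2} E\bigl[ \bigl( G_1 Y_1 I_{(0,z]}(|G_1 Y_1|/a_n) \bigr)^2 \bigr]$
$\quad \sim c z^{2-\alpha} \quad \text{as } n \to \infty$
$\quad \to 0 \quad \text{as } z \downarrow 0.$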
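Next we treat $T_2$. Fix $0 < \delta < M < \infty$ to be chosen later.
Notice that, by Karamata's theorem and the uniform convergence theorem for
regularly varying functions, uniformly for $c \in [\delta, M]$,

$\frac{E[Y_1 I_{(cx,\infty)}(|Y_1|)]}{cx\, P(|Y_1| > cx)} \to C \quad \text{as } x \to \infty,$

for some constant $C$. Taking this into account, the strong law of large
numbers yields, with probability 1,

(3.8)  $a_n^{-1} \sum_{t=1}^{n} G_t I_{[\delta,M]}(|G_t|)\, E[Y_t I_{(z,\infty)}(|G_t Y_t|/a_n) \mid G_t]$
$\quad = a_n^{-1} \sum_{t=1}^{n} G_t I_{[\delta,M]}(|G_t|) \times \bigl[ (z a_n/G_t)\, P(|Y_t| > z a_n/|G_t| \mid G_t)\,(C + o(1)) \bigr]$
$\quad = (C + o(1))\, z^{1-\alpha}\, n^{-1} \sum_{t=1}^{n} |G_t|^{\alpha} I_{[\delta,M]}(|G_t|)$
$\quad \to C z^{1-\alpha} E\bigl[ |G_1|^{\alpha} I_{[\delta,M]}(|G_1|) \bigr].$

On the other hand, since $G_1 I_{[\delta,M]}(|G_1|) Y_1$ is regularly
varying with index $\alpha \in (1,2)$, by the same argument and Breiman's
result,

(3.9)  $n a_n^{-1} E\bigl[ G_1 I_{[\delta,M]}(|G_1|)\, Y_1 I_{(z,\infty)}(|G_1 Y_1|/a_n) \bigr]$
$\quad = n a_n^{-1} \bigl[ (C + o(1))\,(z a_n)\, P\bigl( |G_1| I_{[\delta,M]}(|G_1|)|Y_1| > z a_n \bigr) \bigr]$
$\quad = (C + o(1))\, z^{1-\alpha} E\bigl[ |G_1|^{\alpha} I_{[\delta,M]}(|G_1|) \bigr].$

This shows that (3.8) and (3.9) cancel asymptotically as $n \to \infty$ for
every fixed $z$.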

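A similar argument shows that, with probability 1,

(3.10)  $\Bigl| a_n^{-1} \sum_{t=1}^{n} G_t I_{[0,\delta]}(|G_t|)\, E[Y_t I_{(z,\infty)}(|G_t Y_t|/a_n) \mid G_t] \Bigr|$
$\quad \le a_n^{-1} \sum_{t=1}^{n} |G_t| I_{[0,\delta]}(|G_t|)\, E\bigl[ |Y_1| I_{(z,\infty)}(\delta|Y_1|/a_n) \bigr]$
$\quad \to c\,(z/\delta)^{1-\alpha} E\bigl[ |G_1| I_{[0,\delta]}(|G_1|) \bigr].$

Moreover,

(3.11)  $n a_n^{-1} \bigl| E\bigl[ G_1 I_{[0,\delta]}(|G_1|)\, Y_1 I_{(z,\infty)}(|G_1 Y_1|/a_n) \bigr] \bigr|$
$\quad \le n a_n^{-1} E\bigl[ |G_1| I_{[0,\delta]}(|G_1|)\, |Y_1| I_{(z,\infty)}(\delta|Y_1|/a_n) \bigr]$
$\quad \to c\,(z/\delta)^{1-\alpha} E\bigl[ |G_1| I_{[0,\delta]}(|G_1|) \bigr].$

Now choose $\delta = z^2$. Then, first letting $n \to \infty$ and then
$z \downarrow 0$, both (3.10) and (3.11) vanish asymptotically.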
Finally, we consider

$E\Bigl| a_n^{-1} \sum_{t=1}^{n} G_t I_{(M,\infty)}(|G_t|)\, E[Y_t I_{(z,\infty)}(|G_t Y_t|/a_n) \mid G_t] \Bigr|$
$\quad \le a_n^{-1}\, n\, E\bigl[ |G_1| I_{(M,\infty)}(|G_1|)\, |Y_1| I_{(z,\infty)}(|G_1 Y_1|/a_n) \bigr].$

An application of Breiman's result to the regularly varying random variable
$G_1 I_{(M,\infty)}(|G_1|) Y_1$ gives that the right-hand side is
asymptotically equivalent as $n \to \infty$ to

$c z^{1-\alpha} E\bigl[ |G_1|^{\alpha} I_{(M,\infty)}(|G_1|) \bigr].$

Choosing $M$ large enough, the right-hand side is smaller than $z$, say.
The same argument can be applied to

$n a_n^{-1} \bigl| E\bigl[ G_1 I_{(M,\infty)}(|G_1|)\, Y_1 I_{(z,\infty)}(|G_1 Y_1|/a_n) \bigr] \bigr|.$

Collecting the bounds above, we see that

$\lim_{z \downarrow 0} \limsup_{n \to \infty} P(|T_2| > r) = 0, \quad r > 0.$

This together with (3.7) concludes the proof of (2.11) for
$\alpha \in (1,2)$.

For $\alpha = 1$, we use the additional condition of symmetry of $Y_t$.
Then $ES_n(0,y] = 0$ and the same argument as for $\mathrm{var}(T_1)$ above
shows that (2.11) holds in this case as well. This concludes the proof of
(2.11).

Since the conditions of Theorem 2.2 are satisfied for $\alpha \in [1,2)$,
we conclude that

$a_n^{-1}\bigl( \mathbf{S}_n - E\mathbf{S}_n(0,1] \bigr) \xrightarrow{d} \mathbf{X}_\alpha$

for some $\alpha$-stable random vector in $\mathbb{R}^d$. For $\alpha = 1$,
we can drop $E\mathbf{S}_n(0,1]$ because of the symmetry of
$\mathbf{G}_t Y_t$. For $\alpha \in (1,2)$, $\mathbf{G}_t Y_t$ is regularly
varying with index $\alpha$. Since $E(\mathbf{G}_t Y_t) = \mathbf{0}$,
Karamata's theorem yields

$a_n^{-1} E\mathbf{S}_n(0,1] \to \mathbf{b}$

for some constant $\mathbf{b}$ which can be incorporated in the stable
limit, and, therefore, centering in (3.2) can be avoided. This concludes
the proof of Theorem 3.2.

4. Gaussian quasi maximum likelihood estimation for GARCH processes
with heavy-tailed innovations. In this section we apply Theorem 3.2 to
Gaussian quasi maximum likelihood estimation (QMLE) in GARCH processes. The
limit properties of the QMLE were studied by Berkes et al. [3]. They proved
strong consistency of the QMLE under the moment condition
$E|Z_1|^{2+\delta} < \infty$ for some $\delta > 0$ and established
asymptotic normality under $EZ_1^4 < \infty$. Here $(Z_t)$ is an i.i.d.
innovation sequence; see Section 4.1 below for the definition of the GARCH
model and the QMLE. Hall and Yao [16] refined these results and also
allowed for innovations sequences, where $Z_1^2$ is regularly varying with
index $\alpha \in (1,2)$. Then the speed of convergence is slower than the
usual $\sqrt{n}$ rate and the limiting distribution of the QMLE is
(multivariate) $\alpha$-stable.

It is our objective to show that the asymptotic theories for the QMLE 
under light- and heavy-tailed innovations parallel each other and that very 
similar techniques can be applied in both cases. However, in the light-tailed 
case (see [3]) an application of the CLT for stationary ergodic martingale 
differences is the basic tool which establishes the asymptotic normality of 
the QMLE. In the heavy-tailed situation one depends on an analog of the 
CLT which is provided by Theorem 3.2. 

As a matter of fact, the structure of the proofs shows that the asymptotic
properties of the QMLE are not dependent on the particular structure of the
GARCH process if one can establish the regular variation of the
finite-dimensional distributions of the underlying process $(X_t)$ and the
mixing condition $\mathcal{A}(a_n)$. Therefore, the results of this section
have the potential to be extended to more general models, including, for
example, the AGARCH or EGARCH models whose QMLE properties in the
light-tailed case are treated in [28]. The most intricate step in the proof
is, however, the verification of this mixing condition for a given time
series model. We establish this condition for a GARCH process by an
adaptation of Theorem 4.3 in [21]; this yields strong mixing with geometric
rate of the relevant sequence. We devote Section 4.4 to the solution of
this problem.

Before we start, we introduce some notation. If $K \subset \mathbb{R}^d$ is
a compact set, we write $C(K, \mathbb{R}^{d'})$ for the space of continuous
$\mathbb{R}^{d'}$-valued functions equipped with the sup-norm
$\|v\|_K = \sup_{s \in K} |v(s)|$. The space
$C(K, \mathbb{R}^{d_1 \times d_2})$ consists of the continuous
$d_1 \times d_2$-matrix valued functions on $K$; in
$\mathbb{R}^{d_1 \times d_2}$ we work with the operator norm induced by the
Euclidean norm $|\cdot|$, that is,

$\|A\| = \sup_{|\mathbf{x}| = 1} |A\mathbf{x}|, \quad A \in \mathbb{R}^{d_1 \times d_2}.$

4.1. Definition of the QMLE. Recall the definition of a GARCH(p, q)
process $(X_t)$ from (1.1). As before, $(Z_t)$ is an i.i.d. innovation
sequence with $EZ_1^2 = 1$ and $EZ_1 = 0$, and $\alpha_i, \beta_j$ are
nonnegative constants. GARCH processes have been intensively investigated
over the last few years. Assumptions for strict stationarity are
complicated: they are expressed in terms of Lyapunov exponents of certain
random matrices; see [6] for details. A necessary condition for
stationarity is

(4.1)  $\beta_1 + \cdots + \beta_q < 1$

(Corollary 2.3 in [6]). We will make use of this condition later.






In what follows we always assume strict stationarity of the GARCH
processes. As a matter of fact, the observation $X_t$ is always a
measurable function of the past and present innovations
$(Z_t, Z_{t-1}, Z_{t-2}, \ldots)$; hence, $(X_t)$ is automatically ergodic.

In what follows we review how an approximation to the conditional Gaussian
likelihood of a stationary GARCH(p, q) process is constructed, that is, a
conditional likelihood under the synthetic assumption $Z_t$ i.i.d.
$\sim N(0,1)$. Given $X_0, \ldots, X_{-p+1}$ and
$\sigma_0^2, \ldots, \sigma_{-q+1}^2$, the random variables
$X_1, \ldots, X_n$ are conditionally Gaussian with mean zero and variances
$h_t(\theta)$, $t = 1, \ldots, n$, where
$\theta = (\alpha_0, \alpha_1, \ldots, \alpha_p, \beta_1, \ldots, \beta_q)^T$
denotes the presumed parameter and

$h_t(\theta) =
\begin{cases}
\sigma_t^2, & t \le 0, \\
\alpha_0 + \alpha_1 X_{t-1}^2 + \cdots + \alpha_p X_{t-p}^2 + \beta_1 h_{t-1}(\theta) + \cdots + \beta_q h_{t-q}(\theta), & t > 0.
\end{cases}$

The conditional Gaussian log-likelihood has the form

(4.2)  $\log f_\theta(X_1, \ldots, X_n \mid X_0, \ldots, X_{-p+1}, \sigma_0^2, \ldots, \sigma_{-q+1}^2) = -\frac{n}{2}\log(2\pi) - \frac{1}{2} \sum_{t=1}^{n} \Bigl( \frac{X_t^2}{h_t(\theta)} + \log h_t(\theta) \Bigr).$

Since $X_0, \dots, X_{-p+1}$ are not available and the squared volatilities $\sigma_0^2, \dots, \sigma_{-q+1}^2$
are unobservable, the conditional Gaussian log-likelihood (4.2) cannot be numerically
evaluated without a certain initialization for $\sigma_0^2, \dots, \sigma_{-q+1}^2$ and
$X_0, \dots, X_{-p+1}$. The initial values being asymptotically irrelevant, we set the
$X_t$'s equal to zero and $\hat h_t(\theta) = \alpha_0/(1 - \beta_1 - \cdots - \beta_q)$ for $t \le 0$. We arrive at

(4.3)    $$\hat h_t(\theta) = \begin{cases} \alpha_0/(1 - \beta_1 - \cdots - \beta_q), & t \le 0, \\ \alpha_0 + \alpha_1 X_{t-1}^2 + \cdots + \alpha_{\min(p,t-1)} X_{\max(t-p,1)}^2 + \beta_1 \hat h_{t-1}(\theta) + \cdots + \beta_q \hat h_{t-q}(\theta), & t > 0. \end{cases}$$
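The recursion (4.3) translates directly into a loop. The following Python sketch is illustrative only: the function name `h_hat` and the argument layout (a list of observations, the constant `alpha0`, and the coefficient lists `alpha` and `beta`) are assumptions made here and are not part of the paper.

```python
def h_hat(x, alpha0, alpha, beta):
    """Squared-volatility estimates from recursion (4.3).

    x      : observations X_1, ..., X_n (x[0] is X_1)
    alpha0 : constant term alpha_0 > 0
    alpha  : ARCH coefficients alpha_1, ..., alpha_p
    beta   : GARCH coefficients beta_1, ..., beta_q with sum < 1
    Returns the list [h_1, ..., h_n].
    """
    p, q = len(alpha), len(beta)
    h0 = alpha0 / (1.0 - sum(beta))      # initial value used for all t <= 0
    h = []                               # h[t - 1] stores h_t for t >= 1
    for t in range(1, len(x) + 1):
        value = alpha0
        # ARCH part: only observed X_{t-i} with t - i >= 1 (X_s = 0 for s <= 0)
        for i in range(1, min(p, t - 1) + 1):
            value += alpha[i - 1] * x[t - i - 1] ** 2
        # GARCH part: fall back to h0 whenever the lagged index t - j is <= 0
        for j in range(1, q + 1):
            lagged = h[t - j - 1] if t - j >= 1 else h0
            value += beta[j - 1] * lagged
        h.append(value)
    return h
```

For a GARCH(1, 1) parameter the call `h_hat(x, alpha0, [alpha1], [beta1])` returns the estimates for $t = 1, \dots, n$; the cost is of order $n(p+q)$ operations.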

The function $(\hat h_t(\theta))^{1/2}$ can be understood as an estimate of the volatility
at time $t$ and under parameter hypothesis $\theta$. It can be established that
$|\hat h_t - \check h_t| \to 0$ with a geometric rate of convergence and uniformly on the
compact set $K$ defined in (4.4) below. This suggests that, by replacing the conditional
variances $\check h_t(\theta)$ by $\hat h_t(\theta)$ in (4.2), we obtain a good approximation to the
conditional Gaussian log-likelihood. Since the constant $-n \log(2\pi)/2$ does not matter for the
optimization, we define the QMLE $\hat\theta_n$ as a maximizer of the function

$$\hat L_n(\theta) = \sum_{t=1}^n \hat\ell_t(\theta) = -\frac12 \sum_{t=1}^n \left( \frac{X_t^2}{\hat h_t(\theta)} + \log \hat h_t(\theta) \right)$$

with respect to $\theta \in K$, with $K$ being the compact set

(4.4)    $$K = \{\theta \in \mathbb{R}^{p+q+1} \mid m \le \alpha_i, \beta_j \le M,\ \beta_1 + \cdots + \beta_q \le \bar\beta\},$$

where $0 < m < M < \infty$ and $0 < \bar\beta < 1$ are such that $qm < \bar\beta$.
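To make the objective concrete, here is a minimal Python sketch of the Gaussian quasi log-likelihood and of the membership test for $K$. The names `h_hat`, `quasi_log_likelihood` and `in_K` are hypothetical, and the sketch assumes that the bounds $m$ and $M$ in (4.4) apply to all coordinates of $\theta$, including $\alpha_0$.

```python
import math

def h_hat(x, alpha0, alpha, beta):
    """Recursion (4.3): X_s = 0 and h_s = alpha0 / (1 - sum(beta)) for s <= 0."""
    h0 = alpha0 / (1.0 - sum(beta))
    h = []
    for t in range(1, len(x) + 1):
        arch = sum(a * x[t - i - 1] ** 2
                   for i, a in enumerate(alpha, start=1) if t - i >= 1)
        garch = sum(b * (h[t - j - 1] if t - j >= 1 else h0)
                    for j, b in enumerate(beta, start=1))
        h.append(alpha0 + arch + garch)
    return h

def quasi_log_likelihood(x, alpha0, alpha, beta):
    """Objective -1/2 * sum_t (X_t^2 / h_t + log h_t) with h_t from (4.3)."""
    h = h_hat(x, alpha0, alpha, beta)
    return -0.5 * sum(xt ** 2 / ht + math.log(ht) for xt, ht in zip(x, h))

def in_K(alpha0, alpha, beta, m, M, beta_bar):
    """Membership in the compact set K of (4.4) (bounds assumed for all coordinates)."""
    coords = [alpha0, *alpha, *beta]
    return all(m <= c <= M for c in coords) and sum(beta) <= beta_bar
```

A maximizer over $K$ could then be searched for with any constrained optimizer; the choice of optimizer is not specified in the text and is left open here.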
Remark 4.1. From a comparison with [3], one might think at first sight
that our definition of the QMLE is different from theirs. To see that $\hat h_t$
coincides with $\hat w_t$ in [3], introduce the polynomials

$$\alpha(z) = \alpha_1 z + \cdots + \alpha_p z^p \quad\text{and}\quad \beta(z) = 1 - \beta_1 z - \cdots - \beta_q z^q$$

for every $\theta = (\alpha_0, \alpha_1, \dots, \alpha_p, \beta_1, \dots, \beta_q)^T \in K$. Then one can show by induction
on $t$ that

(4.5)    $$\hat h_t(\theta) = \frac{\alpha_0}{\beta(1)} + \sum_{j=1}^{t-1} \psi_j(\theta) X_{t-j}^2,$$

where the coefficients $\psi_j(\theta)$ are defined through

(4.6)    $$\frac{\alpha(z)}{\beta(z)} = \sum_{j=1}^{\infty} \psi_j(\theta) z^j, \qquad |z| \le 1.$$

Note that the latter Taylor series representation is valid because $\beta_i \ge 0$ and
$\beta_1 + \cdots + \beta_q \le \bar\beta < 1$ imply $\beta(z) \ne 0$ for $|z| \le 1 + \epsilon$ and $\epsilon > 0$ sufficiently
small. We choose (4.3) rather than (4.5) as a first definition for the squared
volatility estimate under parameter hypothesis $\theta$, because the recursion (4.3)
is natural and computationally attractive. In [3] the starting point for the
definition of the QMLE is Theorem 2.2, which says that for all $t \in \mathbb{Z}$ one has
$h_t(\theta_0) = \sigma_t^2$, where $\theta_0$ is the true parameter and

(4.7)    $$h_t(\theta) = \frac{\alpha_0}{\beta(1)} + \sum_{j=1}^{\infty} \psi_j(\theta) X_{t-j}^2.$$

In [3] this leads to the definition of a squared volatility estimate at time $t$
under parameter $\theta$ based on $(X_1, \dots, X_n)$, which is given by (4.5). Note also
that $(h_t(\theta))$ obeys

(4.8)    $$h_{t+1}(\theta) = \alpha_0 + \alpha_1 X_t^2 + \cdots + \alpha_p X_{t+1-p}^2 + \beta_1 h_t(\theta) + \cdots + \beta_q h_{t+1-q}(\theta), \qquad \theta \in K.$$
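The coefficients $\psi_j(\theta)$ in (4.6) can be computed by matching powers of $z$ in $\alpha(z) = \beta(z) \sum_j \psi_j(\theta) z^j$, which gives $\psi_j = \alpha_j + \beta_1 \psi_{j-1} + \cdots + \beta_q \psi_{j-q}$ with $\alpha_j = 0$ for $j > p$ and $\psi_k = 0$ for $k \le 0$. The Python sketch below is a hedged illustration of this computation and of the series form (4.5); the function names are hypothetical and not taken from the paper or from [3].

```python
def psi_coefficients(alpha, beta, n_terms):
    """psi_1, ..., psi_{n_terms} from alpha(z) / beta(z) = sum_j psi_j z^j."""
    p, q = len(alpha), len(beta)
    psi = [0.0] * (n_terms + 1)          # psi[0] = 0 is only a placeholder
    for j in range(1, n_terms + 1):
        value = alpha[j - 1] if j <= p else 0.0
        for i in range(1, min(q, j - 1) + 1):
            value += beta[i - 1] * psi[j - i]
        psi[j] = value
    return psi[1:]

def h_hat_series(x, alpha0, alpha, beta):
    """Right-hand side of (4.5) for t = 1, ..., n (x[0] is X_1)."""
    n = len(x)
    psi = psi_coefficients(alpha, beta, n)
    const = alpha0 / (1.0 - sum(beta))   # alpha_0 / beta(1)
    return [const + sum(psi[j - 1] * x[t - j - 1] ** 2 for j in range(1, t))
            for t in range(1, n + 1)]
```

For GARCH(1, 1) the recursion reduces to $\psi_j = \alpha_1 \beta_1^{j-1}$. Evaluating (4.5) this way costs of order $n^2$ operations, which is one practical reason to prefer the recursion (4.3).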



4.2. Limit distribution in the case $EZ_1^4 < \infty$. First we list the conditions
employed by [3] for establishing consistency and asymptotic normality of $\hat\theta_n$.
Write $\theta_0 = (\alpha_0^0, \alpha_1^0, \dots, \alpha_p^0, \beta_1^0, \dots, \beta_q^0)^T$ for the true parameter.

C.1. There is a $\delta > 0$ such that $E|Z_1|^{2+\delta} < \infty$.
C.2. The distribution of $|Z_1|$ is not concentrated in one point.
C.3. There is a $\mu > 0$ such that $P(|Z_1| \le t) = o(t^\mu)$ as $t \downarrow 0$.
C.4. The true parameter $\theta_0$ lies in the interior of $K$.
C.5. The polynomials $\alpha^0(z) = \alpha_1^0 z + \cdots + \alpha_p^0 z^p$ and $\beta^0(z) = 1 - \beta_1^0 z - \cdots - \beta_q^0 z^q$ do not have any common roots.
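Condition C.5 can be checked exactly for rational coefficients: two polynomials share a (complex) root if and only if their greatest common divisor has positive degree. The following Python sketch uses the Euclidean algorithm with exact fractions; the helper names are hypothetical, and coefficients should be passed as strings or `Fraction` objects so that no rounding occurs.

```python
from fractions import Fraction

def _trim(coeffs):
    """Drop trailing zero coefficients (lists are ordered by increasing degree)."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs

def _remainder(f, g):
    """Remainder of the polynomial division of f by g (g must be nonzero)."""
    f, g = _trim(f), _trim(g)
    while len(f) >= len(g):
        factor = f[-1] / g[-1]
        shift = len(f) - len(g)
        for k, gk in enumerate(g):
            f[k + shift] -= factor * gk
        f = _trim(f)                     # leading term cancels exactly
    return f

def have_common_root(alpha, beta):
    """True if alpha(z) = sum_i alpha_i z^i and beta(z) = 1 - sum_j beta_j z^j
    have a common complex root (alpha = (alpha_1..alpha_p), beta = (beta_1..beta_q))."""
    a = _trim([Fraction(0)] + [Fraction(c) for c in alpha])
    b = _trim([Fraction(1)] + [-Fraction(c) for c in beta])
    while b:
        a, b = b, _remainder(a, b)
    return len(a) > 1                    # gcd of positive degree
```

For example, with $\alpha(z) = 0.2 z + 0.1 z^2$ and $\beta(z) = 1 - 0.1 z - 0.3 z^2$ both polynomials vanish at $z = -2$, so C.5 fails; for a GARCH(1, 1) parameter with $\alpha_1 > 0$ the only root of $\alpha(z)$ is $z = 0$, where $\beta(0) = 1$, so C.5 holds.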
Now we are ready to quote the main result of [3]. We cite it in order to
be able to compare the assumptions and assertions both in the light- and 
heavy-tailed cases; see Theorem 4.4 below. 

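Theorem 4.2 (Theorem 4.1 of [3]). Let $(X_t)$ be a stationary GARCH(p, q)
process with true parameter vector $\theta_0$. Suppose the conditions C.1-C.5 hold.
Then the QMLE $\hat\theta_n$ is strongly consistent, that is,

$$\hat\theta_n \xrightarrow{\text{a.s.}} \theta_0, \qquad n \to \infty.$$

If, in addition, $EZ_0^4 < \infty$, then $\hat\theta_n$ is also asymptotically normal, that is,

$$\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{d} \mathcal{N}(0, B_0^{-1} A_0 B_0^{-1}),$$

where the $(p+q+1) \times (p+q+1)$ matrices $A_0$ and $B_0$ are given by

(4.9)    $$A_0 = \frac{EZ_0^4 - 1}{4}\, E\left( \frac{1}{\sigma_1^4}\, h_1'(\theta_0)^T h_1'(\theta_0) \right), \qquad B_0 = -\frac12\, E\left( \frac{1}{\sigma_1^4}\, h_1'(\theta_0)^T h_1'(\theta_0) \right).$$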
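As a worked illustration, reading (4.9) as $A_0$ and $B_0$ being multiples of one and the same matrix, the sandwich covariance of Theorem 4.2 collapses to a single inverse. The shorthand $J_0$ is introduced here only for this computation and does not appear in the paper.

```latex
J_0 := E\Bigl( \sigma_1^{-4}\, h_1'(\theta_0)^T h_1'(\theta_0) \Bigr), \qquad
A_0 = \frac{EZ_0^4 - 1}{4}\, J_0, \qquad
B_0 = -\frac12\, J_0,
\\[4pt]
B_0^{-1} A_0 B_0^{-1}
  = \bigl(-2 J_0^{-1}\bigr)\, \frac{EZ_0^4 - 1}{4}\, J_0\, \bigl(-2 J_0^{-1}\bigr)
  = (EZ_0^4 - 1)\, J_0^{-1}.
```

Under this reading, standard normal innovations ($EZ_0^4 = 3$) would give the asymptotic covariance $2 J_0^{-1}$, and the factor $EZ_0^4 - 1$ shows directly why a finite fourth moment is needed for the normal limit.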

4.3. Limit distribution in the case $EZ_1^4 = \infty$. First we identify the limit
determining term for the QMLE. To this end, we set analogously to [3],

$$L_n(\theta) = \sum_{t=1}^n \ell_t(\theta) = -\frac12 \sum_{t=1}^n \left( \frac{X_t^2}{h_t(\theta)} + \log h_t(\theta) \right)$$

and define $\tilde\theta_n$ as a maximizer of $L_n$ with respect to $\theta \in K$. It is a slightly
simpler problem to analyze $\tilde\theta_n$ because $(\ell_t)$ is stationary ergodic, in contrast
to $(\hat\ell_t)_{t \in \mathbb{N}}$. As is shown in Proposition 4.3 below, $\hat\theta_n$ and $\tilde\theta_n$ are asymptotically
equivalent. It turns out that the asymptotic distribution of the QMLE
is essentially determined by the limit behavior of $L_n'(\theta_0)/n$, up to multiplication
with the matrix $-B_0^{-1}$. These results follow by a careful analysis
of the proofs in [3]. We omit details and refer to the website [20] for a detailed
proof. Compare also with the related reference [28], where the case of
processes with a more general volatility structure than GARCH is treated.

Proposition 4.3. Let $(X_t)$ be a stationary GARCH(p, q) process with
true parameter vector $\theta_0$. Suppose the conditions C.1-C.5 apply. If there is
a positive sequence $(x_n)_{n \ge 1}$ with $x_n = o(n)$ as $n \to \infty$ and

(4.10)    $$x_n\, \frac{L_n'(\theta_0)}{n} \xrightarrow{d} D, \qquad n \to \infty,$$

for an $\mathbb{R}^{p+q+1}$-valued random variable $D$, then the QMLE $\hat\theta_n$ satisfies the
limit relation

(4.11)    $$x_n(\hat\theta_n - \theta_0) \xrightarrow{d} -B_0^{-1} D,$$

where $B_0$ is given by (4.9).

Now we can state the main theorem of this section. We note once again 
that Hall and Yao [16] derived the identical result by means of different 
techniques. 

Theorem 4.4. Let $(X_t)$ be a stationary GARCH(p, q) process with true
parameter vector $\theta_0$. Suppose that $Z_1^2$ is regularly varying with index
$\alpha \in (1, 2)$ and that C.3-C.5 hold. Moreover, assume that $Z_1$ has a Lebesgue density
$f$, where the closure of the interior of the support $\{f > 0\}$ contains the
origin. Define $(x_n) = (n a_n^{-1})$, where

$$P(Z_1^2 > a_n) \sim n^{-1}, \qquad n \to \infty.$$

Then the QMLE $\hat\theta_n$ is consistent and

(4.12)    $$x_n(\hat\theta_n - \theta_0) \xrightarrow{d} D_\alpha, \qquad n \to \infty,$$

for some nondegenerate $\alpha$-stable vector $D_\alpha$.

Before proving the theorem, we discuss its practical consequences for parameter
inference:

• The rate of convergence $x_n$ has, roughly speaking, magnitude $n^{1-1/\alpha}$,
which is less than $\sqrt{n}$. The heavier the tails of the innovations, that is, the
smaller $\alpha$, the slower is the convergence of $\hat\theta_n$ toward the true parameter
$\theta_0$.

• The limit distribution of the standardized differences $(\hat\theta_n - \theta_0)$ is $\alpha$-stable
and, hence, non-Gaussian. The exact parameters of this $\alpha$-stable limit are
not explicitly known.

• Confidence bands based on the normal approximation of Theorem 4.2 are
invalid if $EZ_1^4 = \infty$.

• By the definition of a GARCH process, the distribution of the innovations
$Z_t$ is unknown. Therefore, assumptions about the heaviness of the tails of
its distribution are purely hypothetical. As a matter of fact, the tails of the
distribution of $X_t$ can be regularly varying even if $Z_t$ has light tails, such as
for the normal distribution; see [2]. Depending on the assumptions on the
distribution of $Z_1$, one can develop different asymptotic theories for QMLE
of GARCH processes: asymptotic normality as provided by Theorem 4.2
or infinite variance stable distributions as provided by Theorem 4.4.
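To get a feeling for the loss in the rate of convergence, the following Python sketch compares $n^{1-1/\alpha}$ with $\sqrt{n}$. It assumes a pure power tail, so that $a_n = n^{1/\alpha}$ exactly and slowly varying factors are ignored; the function name `stable_rate` is hypothetical.

```python
import math

def stable_rate(n, alpha):
    """Order of magnitude n^(1 - 1/alpha) of x_n = n / a_n when a_n = n^(1/alpha)."""
    return n ** (1.0 - 1.0 / alpha)

if __name__ == "__main__":
    n = 10_000
    for alpha in (1.2, 1.5, 1.9):
        # Compare the heavy-tailed rate with the classical sqrt(n) rate.
        print(f"alpha={alpha}: x_n ~ {stable_rate(n, alpha):.1f}, sqrt(n) = {math.sqrt(n):.1f}")
```

At the boundary $\alpha = 2$ the two rates coincide, and for every $\alpha \in (1, 2)$ the rate $n^{1-1/\alpha}$ is strictly smaller than $\sqrt{n}$.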
Proof of Theorem 4.4. The proof follows by combining Theorem 3.2
and Proposition 4.3. Indeed, setting

$$G_t = \frac{h_t'(\theta_0)}{\sigma_t^2}, \qquad Y_t = \frac{Z_t^2 - 1}{2} \qquad\text{and}\qquad \mathbf{Y}_t = G_t Y_t,$$

one recognizes that

(4.13)    $$L_n'(\theta_0) = \frac12 \sum_{t=1}^n \frac{h_t'(\theta_0)}{\sigma_t^2}\,(Z_t^2 - 1) = \sum_{t=1}^n \mathbf{Y}_t$$

is a martingale transform. Regular variation of $Z_1^2$ with index $\alpha \in (1, 2)$ implies
A.1, but also C.1 and C.2. Condition A.2 is fulfilled because $\|h_1'/h_1\|_K$
has finite moments of any order (Lemma 5.2 of [3]), and so has $|G_1|$. The
condition A.3 holds if we can show that $(\mathbf{Y}_t)$ is strongly mixing with geometric
rate, in which case we choose $r_n = n^\delta$ in $\mathcal{A}(a_n)$ for any small $\delta > 0$, so
that (3.1) immediately follows. This choice of $(r_n)$ is justified by the arguments
given in [2]. The strong mixing condition with geometric rate of $(\mathbf{Y}_t)$
will be verified in Section 4.4.

Finally, we have to give an argument for $\gamma > 0$. The latter quantity has
interpretation as the extremal index of the sequence $(|\mathbf{Y}_t|)$; see Remark 2.4.
According to Theorem 3.7.2 in [18], if $\gamma = 0$ and for some sequence $(u_n)$
the relation $\liminf_{n \to \infty} P(\tilde M_n \le u_n) > 0$ holds, then one necessarily has
$\lim_{n \to \infty} P(M_n \le u_n) = 1$. Here $M_n = \max(|\mathbf{Y}_1|, \dots, |\mathbf{Y}_n|)$ and $(\tilde M_n)$ is the
corresponding sequence of partial maxima for an i.i.d. sequence $(R_i)$, where
$R_1$ has the same distribution as $|\mathbf{Y}_1|$.

We want to show by contradiction that $\gamma = 0$ is impossible, using the above result. The
random variable $|\mathbf{Y}_1| \stackrel{d}{=} R_1$ is regularly varying with index $\alpha$ since $Y_1$ is regularly
varying with index $\alpha$. Hence, $(a_n^{-1} \tilde M_n)$ has a Fréchet limit distribution
$\Phi_\alpha(x) = \exp\{-x^{-\alpha}\}$, $x > 0$; see Remark 3.1.

On the other hand, we will show that $P(M_n \le x a_n) \to 1$ does not hold for
any positive $x$, thus contradicting the hypothesis $\gamma = 0$. Indeed, straightforward
arguments exploiting

$$\sum_{j=1}^{\infty} \frac{\partial \psi_j(\theta)}{\partial \alpha_i}\, z^j = \frac{z^i}{\beta(z)}, \qquad |z| \le 1,$$

for all $i = 1, \dots, p$, show that

(4.14)    $$\frac{\partial h_t(\theta)}{\partial \alpha_i} \ge 0$$

for all $i = 0, \dots, p$ and

(4.15)    $$\sum_{i=0}^{p} \alpha_i\, \frac{\partial h_t(\theta)}{\partial \alpha_i} = h_t(\theta).$$

Since the Euclidean norm is equivalent to the 1-norm $|x|_1 = \sum_{i=1}^{p+q+1} |x_i|$ and
$\alpha_i \le M$ on $K$, there is a $c > 0$ such that

$$\frac{|h_t'(\theta)|}{h_t(\theta)} \ge c \sum_{i=0}^{p} \frac{\alpha_i}{h_t(\theta)} \left| \frac{\partial h_t(\theta)}{\partial \alpha_i} \right| = \frac{c}{h_t(\theta)} \sum_{i=0}^{p} \alpha_i\, \frac{\partial h_t(\theta)}{\partial \alpha_i} = c.$$

Note that the last two equalities in the latter display are a consequence
of (4.14) and (4.15). In particular, we proved that $|G_i| \ge c$ for all $i$ and
therefore

$$P(M_n \le x a_n) = P\Bigl( \max_{i=1,\dots,n} |G_i|\,|Y_i| \le x a_n \Bigr) \le P\Bigl( \max_{i=1,\dots,n} |Y_i| \le c^{-1} x a_n \Bigr).$$

The same classical limit result for maxima as above ensures that the right-hand
side probability converges to a Fréchet limit, which is strictly smaller than 1
for every positive $x$. Thus, we have proved $\gamma > 0$.
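The classical limit result for maxima used in this argument can be checked numerically in a simple special case. The Python sketch below assumes exact Pareto tails $P(R > r) = r^{-\alpha}$, $r \ge 1$, which is an illustrative choice and not the distribution of $|Y_1|$ in the proof; then $a_n = n^{1/\alpha}$ and the distribution of the normalized maximum is available in closed form.

```python
import math

def pareto_max_cdf(n, x, alpha):
    """P(max(R_1, ..., R_n) <= x * a_n) for i.i.d. Pareto R_i with
    P(R > r) = r^(-alpha), r >= 1, and a_n = n^(1/alpha)."""
    threshold = x * n ** (1.0 / alpha)
    if threshold < 1.0:
        return 0.0
    return (1.0 - threshold ** (-alpha)) ** n

def frechet_cdf(x, alpha):
    """Frechet distribution function exp(-x^(-alpha)) for x > 0."""
    return math.exp(-x ** (-alpha))
```

For large $n$ the two functions agree closely, and `frechet_cdf(x, alpha)` stays below 1 for every finite $x > 0$, which is the property the contradiction argument relies on.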

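Now, all conditions of Theorem 3.2 are verified so that

$$2 a_n^{-1} L_n'(\theta_0) = 2 x_n\, \frac{L_n'(\theta_0)}{n} \xrightarrow{d} \tilde D_\alpha,$$

where $\tilde D_\alpha$ is $\alpha$-stable [notice that $P((Z_1^2 - 1)/2 > a_n/2) \sim P(Z_1^2 > a_n) \sim n^{-1}$].
Since $x_n/n \to 0$, Proposition 4.3 implies

$$x_n(\hat\theta_n - \theta_0) \xrightarrow{d} -2^{-1} B_0^{-1} \tilde D_\alpha = D_\alpha.$$

Recalling that a linear transformation of an $\alpha$-stable random vector is again
$\alpha$-stable (see [27]), we conclude the proof of the theorem. □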



4.4. Verification of strong mixing with geometric rate of $(\mathbf{Y}_t)$. To begin
with, we quote a powerful result due to Mokkadem [21], which allows one to
establish strong mixing in stationary solutions of so-called polynomial linear
stochastic recurrence equations (SREs). A sequence $(\mathbf{Y}_t)$ of random vectors
in $\mathbb{R}^d$ obeys a linear SRE if

(4.16)    $$\mathbf{Y}_t = P_t \mathbf{Y}_{t-1} + Q_t,$$

where $((P_t, Q_t))$ constitutes an i.i.d. sequence with values in $\mathbb{R}^{d \times d} \times \mathbb{R}^d$.
A linear SRE is called polynomial if there exists an i.i.d. sequence $(e_t)$ in
$\mathbb{R}^{d'}$ such that $P_t = P(e_t)$ and $Q_t = Q(e_t)$, where $P(x)$ and $Q(x)$ have entries
and coordinates, respectively, which are polynomial functions of the coordinates
of $x$. The existence and uniqueness of a stationary solution to (4.16)
has been studied by Brandt [10], Bougerol and Picard [7], Babillot et al. [1]
and others. The following set of conditions is sufficient: $E \log^+ \|P_1\| < \infty$,
$E \log^+ |Q_1| < \infty$, and the top Lyapunov coefficient associated with the operator
sequence $(P_t)$ is strictly negative, that is,

(4.17)    $$\rho = \inf\{ t^{-1} E \log \|P_t \cdots P_1\| \mid t \ge 1 \} < 0.$$

Here $\|\cdot\|$ is the operator norm corresponding to an arbitrary fixed norm
$|\cdot|$ in $\mathbb{R}^d$, for example, the Euclidean norm. The following result is a slight
generalization of Theorem 4.3 in [21]; see the beginning of the proof below
for a comparison.
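In the scalar case $d = 1$ the infimum in (4.17) is attained at $t = 1$, so that $\rho = E \log |P_1|$. For GARCH(1, 1) the squared volatility satisfies $\sigma_{t+1}^2 = \alpha_0 + (\alpha_1 Z_t^2 + \beta_1) \sigma_t^2$, hence $\rho = E \log(\alpha_1 Z_1^2 + \beta_1)$. The Python sketch below estimates this quantity by Monte Carlo; the standard normal innovations, the sample size and the function name are assumptions chosen for illustration only and do not reflect the heavy-tailed setting of this paper.

```python
import math
import random

def lyapunov_garch11(alpha1, beta1, n_draws=100_000, seed=0):
    """Monte Carlo estimate of rho = E log(alpha1 * Z^2 + beta1), Z ~ N(0, 1)."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    total = 0.0
    for _ in range(n_draws):
        z = rng.gauss(0.0, 1.0)
        total += math.log(alpha1 * z * z + beta1)
    return total / n_draws
```

By Jensen's inequality $\rho \le \log(\alpha_1 + \beta_1)$, so $\alpha_1 + \beta_1 < 1$ is sufficient but not necessary for $\rho < 0$ in this example.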

Theorem 4.5. Let $(e_t)$ be an i.i.d. sequence of random vectors in $\mathbb{R}^{d'}$.
Then consider the polynomial linear SRE

(4.18)    $$\mathbf{Y}_t = P(e_t) \mathbf{Y}_{t-1} + Q(e_t),$$

where $P(e_t)$ is a random $d \times d$ matrix and $Q(e_t)$ a random $\mathbb{R}^d$-valued vector.
Suppose:

1. $P(0)$ has spectral radius strictly smaller than 1 and the top Lyapunov
coefficient $\rho$ corresponding to $(P(e_t))$ is strictly negative.

2. There is an $s > 0$ such that

$$E\|P(e_1)\|^s < \infty \quad\text{and}\quad E|Q(e_1)|^s < \infty.$$

3. There is a smooth algebraic variety $V \subset \mathbb{R}^{d'}$ such that $e_1$ has a density
$f$ with respect to Lebesgue measure on $V$. Assume that $0$ is contained in
the closure of the interior of the support $\{f > 0\}$.

Then the polynomial linear SRE (4.18) has a unique stationary ergodic solution
$(\mathbf{Y}_t)$ which is absolutely regular with geometric rate and consequently
strongly mixing with geometric rate.

Remark 4.6. As regards the definition of a smooth algebraic variety,
we first introduce the notion of an algebraic subset. An algebraic subset of
$\mathbb{R}^{d'}$ is a set of the form

$$V = \{x \in \mathbb{R}^{d'} \mid F_1(x) = \cdots = F_r(x) = 0\},$$

where $F_1, \dots, F_r$ are real multivariate polynomials. An algebraic variety is an
algebraic subset which is not the union of two proper algebraic subsets. An
algebraic variety is smooth if the Jacobian of $F = (F_1, \dots, F_r)^T$ has identical
rank everywhere on $V$. Examples of smooth algebraic varieties in $\mathbb{R}^{d'}$ are
the hyperplanes of $\mathbb{R}^{d'}$ or $V = \mathbb{R}^{d'}$.

Remark 4.7. Recall that absolute regularity (or $\beta$-mixing) is a mixing
notion which is slightly more restrictive than strong mixing; it requires

$$E\Bigl( \sup_{B \in \sigma(\mathbf{Y}_t,\, t > k)} |P(B \mid \sigma(\mathbf{Y}_s, s \le 0)) - P(B)| \Bigr) =: b_k \to 0, \qquad k \to \infty.$$

Indeed, $\beta$-mixing implies strong mixing with the same rate function; see [14]
for details on mixing.

Proof of Theorem 4.5. If $E\|P(e_1)\|^s < 1$ for some $s > 0$, we can immediately
apply Theorem 4.3 in [21]. In the general case, we use Mokkadem's
result to prove absolute regularity with geometric rate for some subsequence
$(\tilde{\mathbf{Y}}_t) = (\mathbf{Y}_{tm})_{t \in \mathbb{Z}}$, some $m \ge 1$, by observing that $(\tilde{\mathbf{Y}}_t)$ satisfies the linear
SRE (4.19) below. The subsequence argument works because the mixing coefficient
$b_k$ is nonincreasing and since $(\mathbf{Y}_t)$ is a Markov process, for which one
has the simpler representation

$$b_k = E\Bigl( \sup_{B \in \sigma(\mathbf{Y}_{k+1})} |P(B \mid \sigma(\mathbf{Y}_0)) - P(B)| \Bigr);$$

see, for example, [9].

Since $\rho < 0$, there is an $m \ge 1$ with $E \log \|P(e_m) \cdots P(e_1)\| < 0$. From
the fact that the map $u \mapsto E\|P(e_m) \cdots P(e_1)\|^u$ has first derivative equal to
$E \log \|P(e_m) \cdots P(e_1)\|$ at $u = 0$, we deduce that there is an $0 < \tilde s \le s$ with
$E\|P(e_m) \cdots P(e_1)\|^{\tilde s} < 1$. Then note that $(\tilde{\mathbf{Y}}_t) = (\mathbf{Y}_{tm})$ obeys a linear SRE:

(4.19)    $$\tilde{\mathbf{Y}}_t = \tilde P(\tilde e_t)\, \tilde{\mathbf{Y}}_{t-1} + \tilde Q(\tilde e_t),$$

where

$$\tilde e_t = (e_{tm}^T, \dots, e_{(t-1)m+1}^T)^T$$

and

$$\tilde P(\tilde e_t) = P(e_{tm}) \cdots P(e_{(t-1)m+1}), \qquad \tilde Q(\tilde e_t) = Q(e_{tm}) + \sum_{j=1}^{m-1} \Bigl( \prod_{i=1}^{j} P(e_{tm+1-i}) \Bigr) Q(e_{tm-j}).$$

Since both the matrix $\tilde P(\tilde e_t)$ and the vector $\tilde Q(\tilde e_t)$ are polynomial functions of
the coordinates of $\tilde e_t$ and the sequence $(\tilde e_t)$ is i.i.d., $(\tilde{\mathbf{Y}}_t)$ obeys a polynomial
linear SRE. Observe that $\tilde P(0) = (P(0))^m$ has spectral radius strictly smaller
than 1, that $E\|\tilde P(\tilde e_1)\|^{\tilde s} < 1$ and $E|\tilde Q(\tilde e_1)|^{\tilde s} < \infty$ and that $\tilde e_1$ has a density
with respect to Lebesgue measure on $V^m$, where $V^m$ is a smooth algebraic
variety (see A.14 in [21]). Thus, an application of Theorem 4.3 in [21] yields
that $(\tilde{\mathbf{Y}}_t)$ is absolutely regular with geometric rate. This proves the assertion.
□
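The passage from the original SRE to the $m$-step SRE (4.19) is pure bookkeeping, which the following Python sketch checks in the scalar case $d = d' = 1$ (so that the factors commute). The polynomial maps `P_demo` and `Q_demo` and all function names are hypothetical choices made for illustration.

```python
def P_demo(e):
    """Illustrative polynomial coefficient P(e)."""
    return 2 * e + 1

def Q_demo(e):
    """Illustrative polynomial shift Q(e)."""
    return e * e - 3

def iterate_sre(y0, noise, P, Q):
    """Run Y_t = P(e_t) Y_{t-1} + Q(e_t) over the given noise values; return the last Y."""
    y = y0
    for e in noise:
        y = P(e) * y + Q(e)
    return y

def collapsed_step(y_prev, block, P, Q):
    """One step of the m-step SRE; block = (e_{(t-1)m+1}, ..., e_{tm}) in time order."""
    m = len(block)
    p_tilde = 1
    for e in block:                      # P(e_tm) ... P(e_{(t-1)m+1}); scalars commute
        p_tilde *= P(e)
    q_tilde = Q(block[-1])               # term Q(e_tm)
    prod = 1
    for j in range(1, m):                # j = 1, ..., m - 1
        prod *= P(block[m - j])          # product P(e_tm) ... P(e_{tm+1-j})
        q_tilde += prod * Q(block[m - j - 1])   # times Q(e_{tm-j})
    return p_tilde * y_prev + q_tilde
```

Applying `collapsed_step` to consecutive blocks of length $m$ reproduces every $m$-th value of the original recursion, which is all that the subsequence argument needs.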



The following two facts will also be needed.

Lemma 4.8. Let $(P_t)$ be an i.i.d. sequence of $k \times k$ matrices with $E\|P_1\|^s < \infty$
for some $s > 0$. Then the associated top Lyapunov coefficient $\rho < 0$ if and
only if there exist $c > 0$, $\tilde s > 0$ and $\lambda < 1$ so that

(4.20)    $$E\|P_t \cdots P_1\|^{\tilde s} \le c \lambda^t, \qquad t \ge 1.$$

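Proof. For the proof of necessity, observe that there exists $n \ge 1$ such
that $E \log \|P_n \cdots P_1\| < 0$. From the fact that the map $u \mapsto E\|P_n \cdots P_1\|^u$
has first derivative equal to $E \log \|P_n \cdots P_1\|$ at $u = 0$, we deduce that there
is an $\tilde s > 0$ with $E\|P_n \cdots P_1\|^{\tilde s} = \tilde\lambda < 1$. Since the operator norm $\|\cdot\|$ is
submultiplicative and the factors in $P_t \cdots P_1$ are i.i.d.,

$$E\|P_t \cdots P_1\|^{\tilde s} \le \tilde\lambda^{[t/n]} \Bigl( \max_{\ell = 1, \dots, n-1} E\|P_\ell \cdots P_1\|^{\tilde s} \Bigr) \le c \lambda^t, \qquad t \ge 1,$$

for $c = \tilde\lambda^{-1} (\max_{\ell = 1, \dots, n-1} E\|P_\ell \cdots P_1\|^{\tilde s})$ and $\lambda = \tilde\lambda^{1/n}$. Regarding the proof
of sufficiency, use Jensen's inequality and $\lim_{t \to \infty} t^{-1} E \log \|P_t \cdots P_1\| = \rho$ to
conclude

$$\rho = \lim_{t \to \infty} \frac{1}{t \tilde s}\, E \log \|P_t \cdots P_1\|^{\tilde s} \le \limsup_{t \to \infty} \frac{1}{t \tilde s}\, \log E\|P_t \cdots P_1\|^{\tilde s} \le \limsup_{t \to \infty} \frac{1}{t \tilde s}\, (\log c + t \log \lambda) = \frac{\log \lambda}{\tilde s} < 0.$$

This completes the proof of the lemma. □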
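
Lemma 4.9. Suppose that

(4.21)    $$P_t = \begin{pmatrix} A_t & 0_{r \times (k-r)} \\ B_t & C_t \end{pmatrix}, \qquad t \in \mathbb{Z},$$

forms an i.i.d. sequence of $k \times k$ matrices with $E\|P_1\|^s < \infty$, $s > 0$, where
$A_t \in \mathbb{R}^{r \times r}$, $B_t \in \mathbb{R}^{(k-r) \times r}$ and $C_t \in \mathbb{R}^{(k-r) \times (k-r)}$. Then the associated top
Lyapunov coefficient $\rho_P < 0$ if and only if the sequences $(A_t)$ and $(C_t)$ have
top Lyapunov coefficients $\rho_A < 0$ and $\rho_C < 0$.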

Proof. For the proof of sufficiency of $\rho_A < 0$ and $\rho_C < 0$ for $\rho_P < 0$, it
is by Lemma 4.8 enough to derive a moment inequality of the form (4.20)
for $(P_t)$. By induction we obtain

$$P_t \cdots P_1 = \begin{pmatrix} A_t \cdots A_1 & 0_{r \times (k-r)} \\ Q_t & C_t \cdots C_1 \end{pmatrix},$$

where

$$Q_t = B_t A_{t-1} \cdots A_1 + C_t B_{t-1} A_{t-2} \cdots A_1 + C_t C_{t-1} B_{t-2} A_{t-3} \cdots A_1 + \cdots + C_t \cdots C_3 B_2 A_1 + C_t \cdots C_2 B_1.$$

Observe that

(4.22)    $$\max(\|A_t \cdots A_1\|, \|C_t \cdots C_1\|) \le \|P_t \cdots P_1\| \le \|A_t \cdots A_1\| + \|C_t \cdots C_1\| + \|Q_t\|.$$

It is sufficient to show (4.20) for each block in the matrix $P_t \cdots P_1$. Because
of $\rho_A < 0$, $\rho_C < 0$ and $E\|A_1\|^s, E\|C_1\|^s \le E\|P_1\|^s < \infty$, Lemma 4.8 already
implies moment bounds of the form (4.20) for $(A_t)$ and $(C_t)$. Thus, we are
left to bound $\|Q_t\|$. Without loss of generality, we may assume that the
constants $\lambda < 1$ and $\tilde s, c > 0$ in (4.20) are equal for $(A_t)$ and $(C_t)$ and that
$\tilde s \le s \le 1$. From an application of the Minkowski inequality and exploiting
the independence of the factors in each summand of $Q_t$, we obtain the
desired relation

$$E\|Q_t\|^{\tilde s} \le c^2\, t\, E\|B_1\|^{\tilde s}\, \lambda^{t-1} \le \tilde c\, \tilde\lambda^t,$$

for some $\tilde\lambda \in (\lambda, 1)$, $\tilde c > 0$. For the proof of necessity, assume $\rho_P < 0$. Then
the left-hand side estimates in (4.22) and Lemma 4.8 imply that $\rho_A < 0$ and
$\rho_C < 0$. □
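The block structure of the product $P_t \cdots P_1$ can be verified mechanically. The Python sketch below does so for scalar blocks ($k = 2$, $r = 1$), comparing the lower-left entry obtained by repeated multiplication with the sum of products $C_t \cdots C_{j+1} B_j A_{j-1} \cdots A_1$; the function names are hypothetical and integer inputs keep the comparison exact.

```python
def lower_left_entry(a, b, c):
    """Sum over j of C_t ... C_{j+1} * B_j * A_{j-1} ... A_1 for scalar blocks.

    a, b, c hold A_1..A_t, B_1..B_t and C_1..C_t (index 0 is time 1).
    """
    t = len(a)
    total = 0
    for j in range(1, t + 1):
        term = b[j - 1]
        for i in range(1, j):            # factors A_{j-1}, ..., A_1
            term *= a[i - 1]
        for i in range(j + 1, t + 1):    # factors C_t, ..., C_{j+1}
            term *= c[i - 1]
        total += term
    return total

def product_of_triangular(a, b, c):
    """Entries (upper-left, lower-left, lower-right) of P_t ... P_1,
    where P_s = [[a_s, 0], [b_s, c_s]]."""
    ul, ll, lr = 1, 0, 1                 # start from the identity matrix
    for a_s, b_s, c_s in zip(a, b, c):   # left-multiply by P_s
        ul, ll, lr = a_s * ul, b_s * ul + c_s * ll, c_s * lr
    return ul, ll, lr
```

The lower-left entry consists of $t$ summands, each a product of $t - 1$ diagonal factors and one off-diagonal factor, which is where the factor $t \lambda^{t-1}$ in the moment bound comes from.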

We now exploit Theorem 4.5 in order to establish strong mixing with
geometric rate of the sequence $(\mathbf{Y}_t) = (G_t Y_t)$, where $G_t = h_t'(\theta_0)/\sigma_t^2$ and
$Y_t = (Z_t^2 - 1)/2$.

Proposition 4.10. Let $(X_t)$ be a stationary GARCH(p, q) process with
true parameter vector $\theta_0$. Moreover, assume that $Z_1$ has a Lebesgue density
$f$, where the closure of the interior of the support $\{f > 0\}$ contains the
origin. Then $(\mathbf{Y}_t)$ is absolutely regular with geometric rate.

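Proof. For the proof of this result, we first embed $(\mathbf{Y}_t)$ in a polynomial
linear SRE. Without loss of generality, assume $p, q \ge 3$. Write

$$\tilde{\mathbf{Y}}_t = \Bigl( \sigma_{t+1}^2, \dots, \sigma_{t-q+2}^2,\; X_t^2, \dots, X_{t-p+2}^2,\;
\frac{\partial h_{t+1}(\theta_0)}{\partial \alpha_0}, \dots, \frac{\partial h_{t-q+2}(\theta_0)}{\partial \alpha_0},\; \dots,\; \frac{\partial h_{t+1}(\theta_0)}{\partial \alpha_p}, \dots, \frac{\partial h_{t-q+2}(\theta_0)}{\partial \alpha_p},\;
\frac{\partial h_{t+1}(\theta_0)}{\partial \beta_1}, \dots, \frac{\partial h_{t-q+2}(\theta_0)}{\partial \beta_1},\; \dots,\; \frac{\partial h_{t+1}(\theta_0)}{\partial \beta_q}, \dots, \frac{\partial h_{t-q+2}(\theta_0)}{\partial \beta_q} \Bigr)^T.$$

Since $Z_t^2 = X_t^2/\sigma_t^2$, we have

$$\sigma(\mathbf{Y}_t, t > k) \subset \sigma(\tilde{\mathbf{Y}}_t, t > k) \quad\text{and}\quad \sigma(\mathbf{Y}_t, t \le 0) \subset \sigma(\tilde{\mathbf{Y}}_t, t \le 0).$$

Consequently, it is enough to demonstrate absolute regularity with geometric
rate of the sequence $(\tilde{\mathbf{Y}}_t)$. We introduce various matrices. Write $0_{d_1 \times d_2}$ for
the $d_1 \times d_2$ matrix with all entries equal to zero and let $I_d$ denote the identity
matrix of dimension $d$. Then set

$$M_1(Z_t) = \begin{pmatrix}
\tau_t & \beta_q^0 & \xi & \alpha_p^0 \\
I_{q-1} & 0_{(q-1) \times 1} & 0_{(q-1) \times (p-2)} & 0_{(q-1) \times 1} \\
\zeta_t & 0_{1 \times 1} & 0_{1 \times (p-2)} & 0_{1 \times 1} \\
0_{(p-2) \times (q-1)} & 0_{(p-2) \times 1} & I_{p-2} & 0_{(p-2) \times 1}
\end{pmatrix},$$

where

$$\tau_t = (\beta_1^0 + \alpha_1^0 Z_t^2, \beta_2^0, \dots, \beta_{q-1}^0) \in \mathbb{R}^{q-1}, \qquad
\xi = (\alpha_2^0, \dots, \alpha_{p-1}^0) \in \mathbb{R}^{p-2}, \qquad
\zeta_t = (Z_t^2, 0, \dots, 0) \in \mathbb{R}^{q-1}.$$

Moreover, define

$$M_2(Z_t) = \begin{pmatrix} 0_{q \times (p+q-1)} \\ U_1 \\ \vdots \\ U_p \end{pmatrix} \quad\text{and}\quad M_4 = \begin{pmatrix} V_1 \\ \vdots \\ V_q \end{pmatrix},$$

where $U_i \in \mathbb{R}^{q \times (p+q-1)}$ and $V_j \in \mathbb{R}^{q \times (p+q-1)}$ are given by

$$[U_1]_{k,\ell} = Z_t^2\, \delta_{k,1} \delta_{\ell,1}, \qquad [U_i]_{k,\ell} = \delta_{k,1} \delta_{\ell,q+i-1}, \quad i \ge 2, \qquad [V_j]_{k,\ell} = \delta_{k,1} \delta_{\ell,j}.$$

Here $\delta_{\cdot,\cdot}$ denotes the Kronecker symbol. Also introduce the $q \times q$ matrix

$$C = \begin{pmatrix} \beta_1^0 \;\cdots\; \beta_{q-1}^0 & \beta_q^0 \\ I_{q-1} & 0_{(q-1) \times 1} \end{pmatrix}$$

and let

$$M_3 = \operatorname{diag}(C, p+1), \qquad M_5 = \operatorname{diag}(C, q)$$

be the block diagonal matrices consisting of $p+1$ (or $q$) copies of the block
$C$. Finally, we define

$$P(Z_t) = \begin{pmatrix}
M_1(Z_t) & 0_{(p+q-1) \times (p+1)q} & 0_{(p+q-1) \times q^2} \\
M_2(Z_t) & M_3 & 0_{(p+1)q \times q^2} \\
M_4 & 0_{q^2 \times (p+1)q} & M_5
\end{pmatrix}$$

and $Q \in \mathbb{R}^{p+q-1+q(p+q+1)}$ by $[Q]_k = \alpha_0^0\, \delta_{k,1} + \delta_{k,p+q}$. Differentiating both sides
of (4.8) at the true parameter $\theta = \theta_0$, we recognize that

$$h_{t+1}'(\theta_0) = (1, X_t^2, \dots, X_{t+1-p}^2, \sigma_t^2, \dots, \sigma_{t+1-q}^2)^T + \beta_1^0 h_t'(\theta_0) + \cdots + \beta_q^0 h_{t+1-q}'(\theta_0).$$

From this recursive relationship together with $\sigma_{t+1}^2 = \alpha_0^0 + \alpha_1^0 X_t^2 + \cdots +
\alpha_p^0 X_{t+1-p}^2 + \beta_1^0 \sigma_t^2 + \cdots + \beta_q^0 \sigma_{t+1-q}^2$, we derive a polynomial linear SRE for
$(\tilde{\mathbf{Y}}_t)$:

(4.23)    $$\tilde{\mathbf{Y}}_t = P(Z_t)\, \tilde{\mathbf{Y}}_{t-1} + Q.$$

The proof of Proposition 4.10 follows from the following lemma. □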

Lemma 4.11. Under the assumptions of Proposition 4.10, the polynomial
linear SRE (4.23) has a strictly stationary solution $(\tilde{\mathbf{Y}}_t)$ which is absolutely
regular with geometric rate.

Proof. The aim is to show that (4.23) obeys the conditions of Theorem 4.5.
Since $EZ_1^2 = 1$, it is immediate that $E\|P(Z_1)\| < \infty$ since this statement
is true for the Frobenius norm and all matrix norms are equivalent.
Treat the blocks $M_1(Z_t)$, $M_3$ and $M_5$ separately. Observe that the matrix
$M_1(Z_t)$ appears in the linear SRE for the vector $S_t = (\sigma_{t+1}^2, \dots, \sigma_{t-q+2}^2, X_t^2,
\dots, X_{t-p+2}^2)^T$, namely,

$$S_t = M_1(Z_t) S_{t-1} + (\alpha_0^0, 0, \dots, 0)^T.$$

Theorem 1.3 of [6] says that (1.1) admits a unique stationary solution if
and only if $(M_1(Z_t))$ has strictly negative top Lyapunov coefficient; consequently,
$\rho_{M_1} < 0$. Moreover, arguing by recursion on $p$ and expanding the
determinant with respect to the last column, it is easily verified that $M_1(0)$
has characteristic polynomial

$$\det(\lambda I_{p+q-1} - M_1(0)) = \lambda^{p+q-1} \Bigl( 1 - \sum_{i=1}^{q} \beta_i^0 \lambda^{-i} \Bigr).$$

Since (4.1) holds for a stationary GARCH(p, q) process, by the triangle inequality

$$\Bigl| 1 - \sum_{i=1}^{q} \beta_i^0 \lambda^{-i} \Bigr| \ge 1 - \sum_{i=1}^{q} \beta_i^0 |\lambda|^{-i} \ge 1 - \sum_{i=1}^{q} \beta_i^0 > 0$$

if $|\lambda| \ge 1$ and, hence, $M_1(0)$ has spectral radius $< 1$. Observe that the building
block $C$ has characteristic polynomial

$$\det(\lambda I_q - C) = \lambda^q \Bigl( 1 - \sum_{i=1}^{q} \beta_i^0 \lambda^{-i} \Bigr),$$

showing that its spectral radius is strictly smaller than 1 (use the same
argument as before). Thus, the deterministic matrices $M_3$ and $M_5$ have
spectral radius $< 1$, which also implies that their associated top Lyapunov
coefficients are strictly negative. Combining these results, we deduce that
$P(0)$ has spectral radius $< 1$ and conclude by twice applying Lemma 4.9
that $(P(Z_t))$ has strictly negative top Lyapunov coefficient. Hence, by Theorem 4.5
the stationary sequence $(\tilde{\mathbf{Y}}_t)$ is absolutely regular with geometric
rate. □
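The linear SRE for the vector $S_t$ used in this proof can be checked against the GARCH recursion directly. The Python sketch below builds a matrix acting on $S_{t-1} = (\sigma_t^2, \dots, \sigma_{t-q+1}^2, X_{t-1}^2, \dots, X_{t-p+1}^2)^T$ entry by entry from the relations $\sigma_{t+1}^2 = \alpha_0 + \sum_i \alpha_i X_{t+1-i}^2 + \sum_j \beta_j \sigma_{t+1-j}^2$ and $X_t^2 = Z_t^2 \sigma_t^2$. The layout is written out here as an assumption consistent with that state vector, and the function names are hypothetical.

```python
def m1_matrix(z, alpha, beta):
    """Matrix M_1(z) for the state (sigma2_t, ..., sigma2_{t-q+1}, X2_{t-1}, ..., X2_{t-p+1}).

    alpha = (alpha_1, ..., alpha_p), beta = (beta_1, ..., beta_q), p, q >= 1.
    """
    p, q = len(alpha), len(beta)
    d = p + q - 1
    M = [[0.0] * d for _ in range(d)]
    # First row: sigma2_{t+1} without the constant alpha_0.
    M[0][0] = beta[0] + alpha[0] * z * z     # beta_1 + alpha_1 Z_t^2 acts on sigma2_t
    for j in range(1, q):
        M[0][j] = beta[j]                    # beta_{j+1} acts on sigma2_{t-j}
    for i in range(1, p):
        M[0][q + i - 1] = alpha[i]           # alpha_{i+1} acts on X2_{t-i}
    # Shift the stored squared volatilities by one position.
    for j in range(1, q):
        M[j][j - 1] = 1.0
    # New entry X2_t = Z_t^2 * sigma2_t, then shift the stored squares.
    if p >= 2:
        M[q][0] = z * z
    for i in range(1, p - 1):
        M[q + i][q + i - 1] = 1.0
    return M

def step(S, z, alpha0, alpha, beta):
    """One step S_t = M_1(Z_t) S_{t-1} + (alpha_0, 0, ..., 0)^T."""
    M = m1_matrix(z, alpha, beta)
    new = [sum(M[r][c] * S[c] for c in range(len(S))) for r in range(len(S))]
    new[0] += alpha0
    return new
```

For $p = q = 3$ the first component of `step(S, z, ...)` is the next squared volatility computed from the GARCH recursion, and the remaining components are shifted copies of the previous state together with the new square $Z_t^2 \sigma_t^2$.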

Remark 4.12. Since $(X_t^2, \sigma_t^2)$ is a subvector of $\tilde{\mathbf{Y}}_t$, stationary GARCH(p, q)
processes are absolutely regular with geometric rate; this result has previously
been established by Boussama [8].